# URLextractor

**Repository Path**: sslkk666/URLextractor

## Basic Information

- **Project Name**: URLextractor
- **Description**: Information gathering & website reconnaissance | https://phishstats.info/
- **Primary Language**: Shell
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-11-02
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# URLextractor

Information gathering & website reconnaissance
------

**Usage:**
`./extractor http://www.hackthissite.org/`

![](https://github.com/eschultze/URLextractor/blob/master/examples/example1.png)

**Tips:**
* Colorex: colorizes the output. Install it with `pip install colorex` and pipe the results through it: `./extractor http://www.hackthissite.org/ | colorex -g "INFO" -r "ALERT"`
* Tldextract: required by the dnsenumeration function. Install it with `pip install tldextract`

Features:
------

* IP and hosting info like city and country (using [FreegeoIP](http://freegeoip.net/))
* DNS servers (using [dig](http://packages.ubuntu.com/precise/dnsutils))
* ASN, Network range, ISP name (using [RISwhois](https://www.ripe.net/analyse/archived-projects/ris-tools-web-interfaces/riswhois))
* Load balancer test
* Whois for abuse mail (using [Spamcop](https://www.spamcop.net/))
* PAC (Proxy Auto Configuration) file
* Compares hashes to detect differences in code
* robots.txt (recursively looking for hidden stuff)
* Source code (looking for passwords and users)
* External links (frames from other websites)
* Directory FUZZ (like Dirbuster and Wfuzz, using the [Dirbuster](https://www.owasp.org/index.php/Category:OWASP_DirBuster_Project) directory list)
* [URLvoid](http://www.urlvoid.com/) API: checks Google page rank, Alexa rank and possible blacklists
* Provides useful links to other websites for correlating IP/ASN data
* Option to open ALL results in browser at the end
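
The robots.txt feature above can be illustrated with a small filter. This is only a sketch, not the code the script actually uses: the function name `robots_paths` is hypothetical, and it assumes the file's content arrives on standard input. It keeps the `Allow`/`Disallow` entries, strips comments and prints each path once.

```shell
# Hypothetical helper (not part of URLextractor): read robots.txt content
# on stdin and print each Allow/Disallow path once, sorted.
robots_paths() {
  tr -d '\r' |                                                # drop Windows line endings
    grep -iE '^[[:space:]]*(dis)?allow[[:space:]]*:' |        # keep only Allow/Disallow lines
    sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' | # keep the path, drop trailing comments
    grep -v '^$' |                                            # skip empty "Disallow:" entries
    sort -u                                                   # remove duplicates
}
```

It could be fed with something like `curl -s http://www.hackthissite.org/robots.txt | robots_paths`; following the listed paths recursively is left out of the sketch.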

Changelog to version 0.2.0:
------

* [Fix] Changed GeoIP from freegeoip to ip-api
* [Fix/Improvement] Remove duplicates from robots.txt
* [Improvement] Better whois abuse contacts (abuse.net)
* [Improvement] Top passwords collection added to sourcecode checking
* [New feature] First-run verification that installs dependencies if needed
* [New feature] Log file
* [New feature] Check for hostname on log file
* [New feature] Check if hostname is listed on Spamhaus Domain Blacklist
* [New feature] Run a quick dnsenumeration with common server names
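
As a rough illustration of the quick dnsenumeration idea, the sketch below builds candidate host names from a short word list and resolves them with `dig`. The word list and both function names are assumptions made for this example; the names the script actually tries are not listed in this README.

```shell
# Hypothetical word list (an assumption for this example).
COMMON_NAMES="www mail ftp ns1 ns2"

# Print one candidate host name per line for the domain given as $1,
# e.g. hackthissite.org.
candidate_hosts() {
  for name in $COMMON_NAMES; do
    printf '%s.%s\n' "$name" "$1"
  done
}

# Read host names on stdin and print "host answer" for every line
# that dig returns for an A query; hosts without an answer print nothing.
resolve_hosts() {
  while IFS= read -r host; do
    dig +short "$host" A | while IFS= read -r answer; do
      printf '%s %s\n' "$host" "$answer"
    done
  done
}
```

A run might look like `candidate_hosts hackthissite.org | resolve_hosts`.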

Changelog to version 0.1.9:
------

* Abuse mail using lynx instead of ~~curl~~
* Target server name parsing fixed
* More verbose about HTTP codes and directory discovery
* MD5 collection for IP fixed
* Links found now show unique URLs from array
* [New feature] **Google** results
* [New feature] **Bing** IP check for other hosts/vhosts
* [New feature] Opened ports from **Shodan**
* [New feature] **VirusTotal** information about IP
* [New feature] **Alexa Rank** information about $TARGET_HOST

Requirements:
------

Tested on Kali light mini and OS X 10.11.3 with brew
```
sudo apt-get install bc curl dnsutils libxml2-utils whois md5sha1sum lynx openssl -y
```
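
To see which tools are still missing before a run, a check along these lines may help. It is a minimal sketch, not the script's own first-run verification: `missing_commands` is a hypothetical name, and the mapping from package names to command names (for example `dnsutils` to `dig`, `libxml2-utils` to `xmllint`, `md5sha1sum` to `md5sum`) is an assumption.

```shell
# Hypothetical helper: print every command name from the arguments
# that cannot be found on the current PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

For example, `missing_commands bc curl dig xmllint whois md5sum lynx openssl` prints nothing when everything is installed.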

**Configuration file:**
```
CURL_TIMEOUT=15 #timeout in --connect-timeout
CURL_UA=Mozilla #user-agent (keep it simple)
INTERNAL=NO #YES OR NO (show internal network info)
URLVOID_KEY=your_API_key #using API from http://www.urlvoid.com/
FUZZ_LIMIT=10 #how many lines it will read from fuzz file
OPEN_TARGET_URLS=NO #open found URLs at the end of script
OPEN_EXTERNAL_LINKS=NO #open external links (frames) at the end of script
FIRST_TIME=YES #if first time check for dependencies
```
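
Because every line above is a plain `KEY=value` assignment followed by a `#` comment, a shell script can read the file by sourcing it. The sketch below shows one way that could work and where the two curl settings would end up. The function names and the idea of passing the file path as an argument are assumptions, not the script's actual code.

```shell
# Hypothetical loader: source the config file given as $1, then fall back
# to the values shown in the README for anything left unset.
load_config() {
  if [ -r "$1" ]; then
    . "$1"
  fi
  CURL_TIMEOUT=${CURL_TIMEOUT:-15}
  CURL_UA=${CURL_UA:-Mozilla}
  FUZZ_LIMIT=${FUZZ_LIMIT:-10}
}

# Hypothetical wrapper: fetch the URL given as $1 using the configured
# connect timeout and user agent.
fetch() {
  curl -s --connect-timeout "$CURL_TIMEOUT" -A "$CURL_UA" "$1"
}
```

Pass the path with a slash (for example `load_config ./config`, where the file name is assumed), since sourcing a bare name searches `PATH` in POSIX shells.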

Todo list:
------

* [x] Upload to github :)
* [x] Check for installed packages
* [ ] Integration with other APIs
* [ ] Export to CSV
* [ ] Integration with CipherScan

## Stargazers over time

[![Stargazers over time](https://starchart.cc/eschultze/URLextractor.svg)](https://starchart.cc/eschultze/URLextractor)