# GitHub Recon

## Theory

Online code repositories offer a window into an organization's technology stack, revealing the programming languages and frameworks it uses. In some cases, developers unintentionally expose sensitive information, such as credentials and other critical data, in public repositories. These leaks can give us a valuable opportunity during reconnaissance.

## Practice

### GitHub Dorks & Sensitive Data Exposure

To automate the search for sensitive files and hardcoded credentials in **Git repositories**, we can use the following tools:

{% tabs %}
{% tab title="Github Dorks" %}
[Github-dorks](https://github.com/techgaun/github-dorks) is a Python tool that searches for leaked secrets via GitHub search. Its collection of GitHub dorks can reveal sensitive personal and/or organizational information such as private keys, credentials, authentication tokens, etc.

```bash
# search a single repo
github-dork.py -r techgaun/github-dorks

# search all repos of a user
github-dork.py -u techgaun  

# search all repos of an organization
github-dork.py -u dev-nepal
```

Alternatively, we can manually search for specific dorks without using [Github-dorks](https://github.com/techgaun/github-dorks):

<figure><img src="/files/0Kh8jxX4HYnM38J4HMUy" alt=""><figcaption></figcaption></figure>
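When searching manually, it helps to scope each dork to the target with a qualifier such as `org:` or `user:`. The following is a minimal sketch, not part of Github-dorks: it prefixes a few dorks with `org:` and prints web search URLs that can be opened in a browser. The helper names and the `example-org` target are hypothetical.

```python
from urllib.parse import quote_plus

# A few dorks from this page; extend the list as needed.
DORKS = [
    "filename:.npmrc _auth",
    "extension:pem private",
    "filename:wp-config.php",
]


def scope_dork(dork: str, org: str) -> str:
    """Prefix a dork with the org: qualifier so results stay on one target."""
    return f"org:{org} {dork}"


def search_url(query: str) -> str:
    """Build a GitHub web code-search URL for the given query."""
    return "https://github.com/search?q=" + quote_plus(query) + "&type=code"


if __name__ == "__main__":
    # "example-org" is a placeholder target (assumption).
    for dork in DORKS:
        print(search_url(scope_dork(dork, "example-org")))
```

Swapping `org:` for `user:` in `scope_dork` would target a single account instead of an organization.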

Examples of GitHub dorks:

| Dork                                          | Description                                          |
| --------------------------------------------- | ---------------------------------------------------- |
| filename:.npmrc \_auth                        | npm registry authentication data                     |
| filename:.dockercfg auth                      | docker registry authentication data                  |
| extension:pem private                         | private keys                                         |
| extension:ppk private                         | puttygen private keys                                |
| filename:id\_rsa or filename:id\_dsa          | private ssh keys                                     |
| filename:wp-config.php                        | wordpress config files                               |
| filename:.env MAIL\_HOST=smtp.gmail.com gmail | smtp configuration (try different smtp services too) |
| shodan\_api\_key language:python              | Shodan API keys (try other languages too)            |
| /"sk-\[a-zA-Z0-9]{20,50}"/ language:Shell     | OpenAI API keys                                      |
| "api\_hash" "api\_id"                         | Telegram API token                                   |

{% endtab %}

{% tab title="GitHound" %}
[GitHound](https://github.com/tillson/git-hound) hunts down exposed API keys and other sensitive information on GitHub using GitHub code search, pattern matching, and commit history searching.

```bash
# Basic Usage
git-hound --subdomain-file subdomains.txt
echo "\"example.com\"" | git-hound

# Searching for exposed API keys
echo "api.halcorp.biz" | githound --dig-files --dig-commits --many-results --rules halcorp-api-regexes.txt --results-only | python halapitester.py

# Bug Bounty Hunters: Searching for leaked employee API tokens
echo "\"uberinternal.com\"" | githound --dig-files --dig-commits --many-results --languages common-languages.txt --threads 100
```

{% endtab %}

{% tab title="Noseyparker" %}
[Noseyparker](https://github.com/praetorian-inc/noseyparker) is a command-line program that finds secrets and sensitive information in textual data and Git history.

```bash
# Scan a repo
noseyparker scan --datastore np.myDataStore --git-url <repo-url>

# Scan all repos of a user
noseyparker scan --datastore np.myDataStore --github-user <username>

# Scan all repos of an organization
noseyparker scan --datastore np.myDataStore --github-organization <NAME>

# Show the results of a scan
noseyparker report -d np.myDataStore
```
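Scanning Git history matters because a secret removed in a later commit is still present in earlier patches. The sketch below illustrates the idea on patch text (for example, the output of `git log -p --all`); it is not Nosey Parker's implementation. The single private-key rule and the function names are assumptions chosen for illustration.

```python
import re

# Illustrative rule only (assumption); real scanners ship many rules.
PRIVATE_KEY_RE = re.compile(
    r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"
)


def added_lines(patch_text):
    """Yield lines added by a patch, skipping the '+++ b/file' headers."""
    for line in patch_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            yield line[1:]


def scan_patch(patch_text):
    """Return the added lines that match the private-key rule."""
    return [line for line in added_lines(patch_text) if PRIVATE_KEY_RE.search(line)]


if __name__ == "__main__":
    import sys

    # Example: git log -p --all | python scan_patch.py
    for hit in scan_patch(sys.stdin.read()):
        print(hit)
```

Only added lines are inspected, since every secret that ever existed in the repository was added by some commit.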

{% endtab %}

{% tab title="GitHunt" %}
[GitHunt](https://github.com/v4resk/GitHunt) is a (Python) tool for detecting sensitive data exposure in GitHub repositories, leveraging GitHub's search functionality.

```bash
# See available hunting modules
python GitHunt.py hunt -h

# Hunt for OpenAI API Keys
python GitHunt.py hunt -m OpenAI

# Export all valid OpenAI API keys found to a JSON file
python GitHunt.py db -m OpenAI -f json -o ~/export.json
```
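Search results usually contain noise, so candidate strings can be pre-filtered locally before any further checks. The sketch below is not GitHunt's code: it reuses the `sk-[a-zA-Z0-9]{20,50}` pattern from the OpenAI dork on this page, and the function names are hypothetical. Actual key formats may differ from this pattern.

```python
import re

# Pattern taken from the OpenAI dork; real key formats may differ (assumption).
OPENAI_KEY_RE = re.compile(r"sk-[a-zA-Z0-9]{20,50}")


def find_candidate_keys(text):
    """Return unique candidate keys in order of first appearance."""
    seen = {}
    for match in OPENAI_KEY_RE.finditer(text):
        seen.setdefault(match.group(0), None)
    return list(seen)


def redact(key, keep=6):
    """Mask a key for reporting, keeping only its first characters."""
    return key[:keep] + "*" * (len(key) - keep)


if __name__ == "__main__":
    sample = 'export OPENAI_API_KEY="sk-' + "A" * 24 + '"'
    for key in find_candidate_keys(sample):
        print(redact(key))
```

Redacting matches before writing them to logs or reports avoids spreading a leaked credential further.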

{% endtab %}

{% tab title="Gitleaks" %}
[Gitleaks](https://github.com/gitleaks/gitleaks) (Go) is a SAST tool for **detecting** and **preventing** hardcoded secrets like passwords, API keys, and tokens in Git repos.

```bash
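# Gitleaks v8 scans a local clone (-r now means --report-path, not --repo-url)
git clone <GIT_REPO_URL> target-repo && ./gitleaks detect -v --source target-repo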
```

{% endtab %}

{% tab title="Gitrob" %}
[Gitrob](https://github.com/michenriksen/gitrob) (Go) is a tool to help find potentially sensitive files pushed to public repositories on GitHub. It clones repositories belonging to a user or organization down to a configurable depth, iterates through the commit history, and flags files that match signatures for potentially sensitive files.

{% hint style="info" %}
Gitrob needs a GitHub access token in order to interact with the GitHub API. See [Create a personal access token](https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/).
{% endhint %}

```bash
# Run it!
# <TARGET> is an organization or user profile (e.g. v4resk)
gitrob -github-access-token <TOKEN> <TARGET>
```
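The signature matching described above can be approximated in a few lines. This is a minimal sketch under stated assumptions, not Gitrob's signature set: the patterns are derived from the file names in the dork list on this page, and `SIGNATURES` and `flag_paths` are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Illustrative signatures derived from the dork list, not Gitrob's own set.
SIGNATURES = {
    "id_rsa": "private SSH key",
    "id_dsa": "private SSH key",
    "*.pem": "private key",
    "*.ppk": "PuTTYgen private key",
    ".npmrc": "npm registry authentication data",
    ".dockercfg": "Docker registry authentication data",
    "wp-config.php": "WordPress config file",
    ".env": "environment file",
}


def flag_paths(paths):
    """Yield (path, description) for paths whose file name matches a signature."""
    for path in paths:
        name = PurePosixPath(path).name
        for pattern, description in SIGNATURES.items():
            if fnmatchcase(name, pattern):
                yield path, description
                break  # report each path once, for the first matching signature
```

The input could be the file names touched across history, for instance the non-empty lines of `git log --all --name-only --pretty=format:` run inside a clone.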

{% endtab %}
{% endtabs %}

## Resources

{% embed url="<https://github.com/techgaun/github-dorks>" %}


