79633695

Date: 2025-05-22 12:07:37

How to Search Git History for Specific Lines in Diffs with Commit Details (Author, Date)

It's often necessary to find not just which commit changed a certain keyword, but also the specific file, whether the keyword was added or removed, the actual line of text, and details like the commit author and date. Standard git log -S or git log -G can find commits, but getting this detailed, line-specific output requires a bit more work.

This tutorial provides a shell script that does exactly that.
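As background, the difference between -S and -G mentioned above can be seen in a throwaway repository. The following is a minimal sketch, not part of the original script; the temporary directory, the file name app.txt, the TOKEN keyword, and the commit helper are all made up for illustration. It assumes git is installed and that commits can be created with a placeholder identity.

```shell
#!/bin/sh
# Throwaway repository; every name below is illustrative.
demo_dir=$(mktemp -d)
git init -q "$demo_dir"

# Hypothetical helper: commit with a placeholder identity.
commit() {
  git -C "$demo_dir" -c user.name=Demo -c user.email=demo@example.com \
      -c commit.gpgsign=false commit -q -m "$1"
}

# First commit introduces the keyword.
printf 'x = TOKEN + 1\n' > "$demo_dir/app.txt"
git -C "$demo_dir" add app.txt
commit 'add'

# Second commit edits the line but keeps exactly one occurrence of TOKEN.
printf 'x = TOKEN + 2\n' > "$demo_dir/app.txt"
git -C "$demo_dir" add app.txt
commit 'edit'

# -S reports commits that change the NUMBER of occurrences of the string.
s_subjects=$(git -C "$demo_dir" log --format=%s -S'TOKEN')
# -G reports commits whose added or removed lines match the regex.
g_subjects=$(git -C "$demo_dir" log --format=%s -G'TOKEN')

printf 'With -S: %s\n' "$(printf '%s' "$s_subjects" | tr '\n' ' ')"
printf 'With -G: %s\n' "$(printf '%s' "$g_subjects" | tr '\n' ' ')"

rm -rf "$demo_dir"
```

Here -S only reports the commit that introduced TOKEN, while -G also reports the later edit, because the edited line still contains the keyword. That is why the script below relies on -G.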

The Problem

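You want to search your entire Git history for commits where a specific keyword appears in the added or removed lines of a file's diff. For each match, you need to see the commit hash, the author and date, the file path, whether the line was added or removed, and the text of the line itself.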

The Solution: A Shell Script

This script iterates through your Git history, inspects diffs, and formats the output as described.

Bash

#!/bin/sh

# 1. Check if an argument (the keyword) was passed to the script
if [ "$#" -ne 1 ]; then
  echo "Usage: $0 <keyword>"
  echo "Error: Please provide a keyword to search for."
  exit 1
fi

# Use the first argument from the command line as the keyword
KEYWORD="$1"

# The Grep pattern:
#   ^[+-]      : Line starting with '+' (addition) or '-' (deletion).
#   .*         : Followed by any sequence of characters (possibly empty).
#   $KEYWORD   : The keyword itself (as a substring).
GREP_PATTERN='^[+-].*'"$KEYWORD"

echo "Searching for commits containing '$KEYWORD' in diffs..."

# 2. Find commits where the keyword appears in ANY modification of the commit.
#    git log -G uses the KEYWORD as a regex.
git log --all --pretty="format:%H" -G"$KEYWORD" | while IFS= read -r commit_id; do
  # Get the author and date for this commit_id
  # %an = author name
  # %ad = author date. --date=short gives a format YYYY-MM-DD.
  commit_author_name=$(git show -s --format="%an" "$commit_id")
  commit_author_date=$(git show -s --format="%ad" --date=short "$commit_id")

  # 3. For each found commit, list the files that have been modified.
  #    --root makes the very first commit (which has no parent) list its files too.
  git diff-tree --root --no-commit-id --name-only -r "$commit_id" | while IFS= read -r file_path; do
    # Ensure file_path is not an empty string.
    if [ -n "$file_path" ]; then
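      # 4. Get the diff for THIS specific file IN THIS specific commit.
      #    `sed` drops the diff header (everything before the first `@@` hunk
      #    marker) so that the `--- a/...` and `+++ b/...` lines are never matched
      #    when the file path itself contains the keyword.
      #    Then, `grep` (with -E for extended regex) searches for the keyword
      #    in the added/deleted lines.
      git show --pretty="format:" --unified=0 "$commit_id" -- "$file_path" | \
      sed -n '/^@@/,$p' | \
      grep --color=never -E "$GREP_PATTERN" | \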
      while IFS= read -r matched_line; do
        # 5. For each corresponding line, determine the type (ADDITION/DELETION)
        #    and extract the text of the line.

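        # printf is used instead of echo: echo may swallow a line such as "-n"
        # or interpret backslashes, depending on the shell.
        change_char=$(printf '%s\n' "$matched_line" | cut -c1)
        line_text=$(printf '%s\n' "$matched_line" | cut -c2-) # Text from the second character onwards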

        change_type=""
        if [ "$change_char" = "+" ]; then
          change_type="[ADDITION]"
        elif [ "$change_char" = "-" ]; then
          change_type="[DELETION]"
        else
          change_type="[???]" # Should not happen due to the GREP_PATTERN
        fi

        # 6. Display the collected information, including the date and author
        printf '%s [%s, %s] %s %s: %s\n' "$commit_id" "$commit_author_date" "$commit_author_name" "$file_path" "$change_type" "$line_text"
      done
    fi
  done
done

echo "Search completed for '$KEYWORD'."

How it Works

  1. Argument Parsing: The script first checks if exactly one argument (the keyword) is provided. If not, it prints a usage message and exits.

  2. Initial Commit Search: git log --all --pretty="format:%H" -G"$KEYWORD" walks every ref (branches, tags, and remote-tracking refs) and selects the commits whose diff contains an added or removed line matching KEYWORD. The -G option treats the keyword as a regular expression. Only the commit hashes (%H) are printed.

  3. Author and Date Fetching: For each commit_id found, git show -s --format="%an" and git show -s --format="%ad" --date=short are used to retrieve the author's name and the authoring date (formatted as YYYY-MM-DD), respectively. The -s option suppresses diff output, making these calls efficient.

  4. File Iteration: git diff-tree, called with --no-commit-id --name-only -r on the commit_id, lists all files modified in the current commit.

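  5. Diff Inspection: For each modified file, git show --pretty="format:" --unified=0 "$commit_id" -- "$file_path" displays the diff (patch) for that specific file within that commit. The empty format suppresses the commit header, and --unified=0 omits context lines, so only the changed lines remain.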

  6. Line Matching: The output of git show is piped to grep --color=never -E "$GREP_PATTERN".

    • GREP_PATTERN (^[+-].* followed by the keyword) matches lines that start with + or - (added or removed lines) and contain the KEYWORD.

    • --color=never ensures that grep doesn't output color codes if it's aliased to do so, which would interfere with text parsing.

    • -E enables extended regular expressions for the pattern.

  7. Line Processing: Each matching line found by grep is processed:

    • The first character (+ or -) is extracted using cut -c1 to determine if it's an [ADDITION] or [DELETION].

    • The rest of the line text (after the +/-) is extracted using cut -c2-.

  8. Output: Finally, all the collected information (commit ID, date, author, file path, change type, and line text) is printed to the console.
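The line matching and line processing steps can be tried in isolation on a canned diff, without a repository. This is a minimal sketch under stated assumptions: the classify_diff_lines function and the sample diff are hypothetical, and the function uses a case statement and parameter expansion in place of the script's cut calls.

```shell
#!/bin/sh
# Hypothetical helper: read a unified diff on stdin and print every added or
# removed line that contains the keyword, tagged with its change type.
classify_diff_lines() {
  keyword=$1
  grep -E '^[+-].*'"$keyword" |
  while IFS= read -r line; do
    case $line in
      +*) kind='[ADDITION]' ;;
      -*) kind='[DELETION]' ;;
    esac
    # ${line#?} strips the leading '+' or '-' marker.
    printf '%s: %s\n' "$kind" "${line#?}"
  done
}

# Made-up diff body in --unified=0 style (hunk headers, no context lines).
sample_diff='@@ -3 +3 @@
-timeout = 30  # API_KEY rotation
+timeout = 60  # API_KEY rotation
@@ -10,0 +11 @@
+unrelated = true'

result=$(printf '%s\n' "$sample_diff" | classify_diff_lines 'API_KEY')
printf '%s\n' "$result"
```

The hunk headers and the unrelated line are dropped because they either do not start with + or - or do not contain the keyword.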

How to Use

  1. Save the Script: Copy the script above into a new file in your project directory or a directory in your PATH. Let's name it git_search_diff.sh.

  2. Make it Executable: Open your terminal, navigate to where you saved the file, and run:

    Bash

    chmod +x git_search_diff.sh
    
    
  3. Run the Script: Execute the script from the root directory of your Git repository, providing the keyword you want to search for as an argument.

    Bash

    ./git_search_diff.sh "your_keyword_here"
    
    

    For example, to search for the keyword "API_KEY":

    Bash

    ./git_search_diff.sh "API_KEY"
    
    

    Or to search for "Netflix":

    Bash

    ./git_search_diff.sh "Netflix"
    
    

Example Output

The output will look something like this:

Searching for commits containing 'your_keyword_here' in diffs...
abcdef1234567890abcdef1234567890abcdef12 [2023-05-15, John Doe] src/example.js [ADDITION]: // TODO: Integrate your_keyword_here for new feature
fedcba0987654321fedcba0987654321fedcba09 [2022-11-01, Jane Smith] config/settings.py [DELETION]: OLD_API_KEY_FORMAT_WITH_your_keyword_here = "..."
...
Search completed for 'your_keyword_here'.
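Because every result is a single line with a fixed change-type tag, the output is easy to post-process with standard tools. The sketch below counts additions and deletions; the sample lines are invented for illustration and only mimic the output format shown above.

```shell
#!/bin/sh
# Invented sample lines in the script's output format.
sample_output='abcdef1 [2023-05-15, John Doe] src/example.js [ADDITION]: // TODO: use API_KEY
fedcba0 [2022-11-01, Jane Smith] config/settings.py [DELETION]: API_KEY = "..."
0123abc [2022-11-02, Jane Smith] config/settings.py [ADDITION]: API_KEY = load_key()'

# -F treats the tag as a fixed string, so the brackets are not a regex class.
additions=$(printf '%s\n' "$sample_output" | grep -c -F ' [ADDITION]: ')
deletions=$(printf '%s\n' "$sample_output" | grep -c -F ' [DELETION]: ')

printf 'additions=%s deletions=%s\n' "$additions" "$deletions"
```

On a real run, the same filter can be applied directly to the script, for example ./git_search_diff.sh "API_KEY" | grep -F ' [DELETION]: ' to list only the removed lines.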

Notes and Customization
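
Since the keyword is used as a regular expression, a keyword such as config.json also matches configXjson. One possible customization is to escape the keyword before passing it to the script. The escape_ere helper below is hypothetical and not part of the original script; it targets the extended regex syntax of grep -E, and it is an assumption here that git log -G reads the escaped pattern the same way.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape every extended-regex metacharacter
# so that the keyword is matched literally.
escape_ere() {
  printf '%s\n' "$1" | sed 's/[][\\.^$*+?(){}|]/\\&/g'
}

literal=$(escape_ere 'config.json')
printf '%s\n' "$literal"

# The escaped pattern matches the literal text but not a look-alike.
matches=$(printf '%s\n' '+load config.json' '+load configXjson' |
  grep -c -E '^[+-].*'"$literal")
printf 'matching lines: %s\n' "$matches"
```

With this helper defined in your shell, a literal search could be run as ./git_search_diff.sh "$(escape_ere 'config.json')".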

This script provides a powerful way to pinpoint exactly where and how specific keywords were introduced or removed in your project's history, along with valuable contextual information.


Posted by: Sébastien