Searching Filesystems

From csn
Revision as of 06:23, 13 February 2020 by David

The following are some notes on how to search for files on a filesystem. Use them to search the provided Gutenberg Archive.

Searching for filenames

To search for filenames matching a pattern (here, a given extension) you can use:

find /path/to/where/you/search/from -name "*.extension"
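As a small illustration of the pattern matching, the sketch below builds a throwaway directory and searches it. The directory layout and file names are made up for the example; -iname is the case-insensitive counterpart of -name in GNU and BSD find.

```shell
# Hypothetical sandbox: the directory and file names are invented for this demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/books/sub"
touch "$demo_dir/books/1107.txt" "$demo_dir/books/sub/notes.TXT" "$demo_dir/books/cover.jpg"

# Quote the pattern so the shell hands it to find unexpanded.
txt_matches=$(find "$demo_dir" -type f -name "*.txt" | sort)

# -iname matches regardless of case, so notes.TXT is included as well.
txt_matches_any_case=$(find "$demo_dir" -type f -iname "*.txt" | sort)

echo "case-sensitive:"
echo "$txt_matches"
echo "case-insensitive:"
echo "$txt_matches_any_case"

rm -rf "$demo_dir"
```

Leaving the pattern unquoted can fail in surprising ways, because the shell may expand *.txt against the current directory before find ever sees it.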

Searching for text

To search for text inside every file under a directory tree, you can adapt the following:

find /path/to/where/you/search/from -type f -exec grep -H 'text-to-find-here' {} \;

Or you can let grep recurse through the directory itself:

grep -r "string" /path

To show the three lines before and after each match:

grep -r -C 3 foo README.txt
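When the question is "how many times does a string appear", it helps to know that grep can count in more than one way. The following is a minimal sketch on made-up sample files: -l lists the files that match, -c counts matching lines, and -o combined with wc -l counts every occurrence, even when several fall on one line.

```shell
# Hypothetical sample data: the file names and contents are invented for this demo.
demo_dir=$(mktemp -d)
printf 'green verdigris on copper\nverdigris, verdigris everywhere\nnothing here\n' > "$demo_dir/a.txt"
printf 'no match in this file\n' > "$demo_dir/b.txt"

# -l prints only the names of files that contain at least one match.
files_with_match=$(grep -rl "verdigris" "$demo_dir")

# -c counts matching LINES in a file, not individual occurrences.
matching_lines=$(grep -c "verdigris" "$demo_dir/a.txt")

# -o prints each match on its own line, so wc -l counts every occurrence.
occurrences=$(grep -ro "verdigris" "$demo_dir" | wc -l)

echo "files with a match: $files_with_match"
echo "matching lines in a.txt: $matching_lines"
echo "total occurrences: $occurrences"

rm -rf "$demo_dir"
```

In this sample the second line contains the string twice, so the line count and the occurrence count differ. Add -i if the search should ignore case.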

Modification and creation dates

To list the most recently modified files (newest first):

find /path/to/where/you/search/from -type f -exec stat --format '%Y :%y %n' "{}" \; | sort -nr | cut -d: -f2- | head

To search for the oldest files (note that %T+ prints the last modification time, which is used here as a stand-in for the creation date):

find /path/to/where/you/search/from -type f -printf '%T+ %p\n' | sort | head -n 20
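To pick out the Nth oldest file rather than eyeballing the list, the sorted output can be piped through sed. This is a sketch only: it assumes GNU find (for -printf) and GNU touch (for -d), and the file names and dates are invented. Keep in mind that %T+ is the modification time, not a true creation time.

```shell
# Hypothetical sandbox: file names and timestamps are invented for this demo.
demo_dir=$(mktemp -d)
touch -d '2001-01-01 00:00:00' "$demo_dir/a.txt"
touch -d '2005-03-10 09:30:00' "$demo_dir/b.txt"
touch -d '2010-06-15 12:00:00' "$demo_dir/c.txt"
touch -d '2020-02-13 06:23:00' "$demo_dir/d.txt"

# %T+ prints the modification time as YYYY-MM-DD+HH:MM:SS, which sorts
# correctly as plain text; cut then drops the timestamp column.
oldest=$(find "$demo_dir" -type f -printf '%T+ %p\n' | sort | head -n 1 | cut -d' ' -f2-)

# sed -n '3p' prints only the third line, i.e. the third oldest file.
third_oldest=$(find "$demo_dir" -type f -printf '%T+ %p\n' | sort | sed -n '3p' | cut -d' ' -f2-)

echo "oldest: $oldest"
echo "third oldest: $third_oldest"

rm -rf "$demo_dir"
```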

To find files of an exact size, for example 68 bytes:

find /path/to/where/you/search/from -type f -size 68c -exec ls {} \;

To find files larger than 512 KiB you could use:

find /path/to/where/you/search/from -type f -size +512k -exec ls -lh {} \;
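The size suffixes behave differently enough to be worth a quick experiment. The sketch below uses invented file names and sizes; c means an exact byte count, while k counts 1 KiB units and GNU find rounds sizes up to whole units before comparing.

```shell
# Hypothetical sandbox: file names and sizes are invented for this demo.
demo_dir=$(mktemp -d)
head -c 68 /dev/zero > "$demo_dir/tiny.bin"
head -c 524288 /dev/zero > "$demo_dir/exactly_512k.bin"
head -c 600000 /dev/zero > "$demo_dir/big.bin"

# 68c matches files that are exactly 68 bytes long.
exact_68=$(find "$demo_dir" -type f -size 68c)

# +512k matches files that occupy MORE than 512 units of 1 KiB, so a file
# of exactly 524288 bytes (512 KiB) is not reported.
over_512k=$(find "$demo_dir" -type f -size +512k)

echo "exactly 68 bytes: $exact_68"
echo "over 512 KiB: $over_512k"

rm -rf "$demo_dir"
```

For an exact size that is known in bytes, such as the 255258 bytes asked about in the questions, the c suffix is the one to use.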

To find the largest files and directories under a path (du -a reports directories as well as files):

du -a /path/to/where/you/search/from | sort -n -r | head -n 20
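Because du -a also reports directory totals, the top of that list is usually the starting directory itself. If only regular files are of interest, one possible alternative (assuming GNU find for -printf, with invented file names) is to print each file's size in bytes and sort on that:

```shell
# Hypothetical sandbox: file names and sizes are invented for this demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
head -c 100 /dev/zero > "$demo_dir/small.bin"
head -c 5000 /dev/zero > "$demo_dir/sub/medium.bin"
head -c 90000 /dev/zero > "$demo_dir/sub/large.bin"

# %s is the file size in bytes, %p the path; sort -nr puts the largest first.
ranked=$(find "$demo_dir" -type f -printf '%s %p\n' | sort -nr | head -n 20)

# Take the first line and strip the size column to get just the path.
largest=$(printf '%s\n' "$ranked" | head -n 1 | cut -d' ' -f2-)

echo "$ranked"
echo "largest file: $largest"

rm -rf "$demo_dir"
```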

Investigating the frequency of elements in a file

I use the following on the command line to look for frequent elements. It splits the file into one token per line, counts how often each token occurs, and lists the 200 most common. You still need to use your brain to separate the signal from the noise, but it can be useful for spotting unusually frequent IP addresses, MAC addresses, usernames and so on.

sed -e 's/\s/\n/g' < file_of_interest.txt | sort | uniq -c | sort -nr | head -n 200
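Here is the same idea run on a tiny invented log, as a sketch. It swaps sed for tr -s '[:space:]' '\n', which does the same job (one token per line) without relying on the GNU-specific \s and \n escapes, and squeezes runs of whitespace so no empty tokens are counted.

```shell
# Hypothetical sample log: the addresses and usernames are invented for this demo.
demo_file=$(mktemp)
printf '10.0.0.1 login alice\n10.0.0.2 login bob\n10.0.0.1 logout alice\n10.0.0.1 upload carol\n' > "$demo_file"

# One token per line, then count duplicates and rank by count, highest first.
ranking=$(tr -s '[:space:]' '\n' < "$demo_file" | sort | uniq -c | sort -nr)

# uniq -c prints "count token"; awk pulls the two columns apart.
top_token=$(printf '%s\n' "$ranking" | head -n 1 | awk '{print $2}')
top_count=$(printf '%s\n' "$ranking" | head -n 1 | awk '{print $1}')

echo "$ranking"
echo "most frequent: $top_token ($top_count times)"

rm -f "$demo_file"
```

In this sample the address 10.0.0.1 stands out with three occurrences, which is the kind of outlier the pipeline is meant to surface.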

Questions

Please highlight the text below for spoilers/answers.

  1. How many times does the string "verdigris" appear? Enter a number only: 9
  2. What is the surname of the author of the file “1107.txt”? The answer is case sensitive: Shakespeare
  3. What is the surname of the author of the book in the file that is exactly 255258 bytes? The answer is case sensitive: Lobo
  4. What is the filename of the file with the 3rd oldest creation date: 1499.txt
  5. Find the word that follows the text “Next day there was a surprise for Jack” (case sensitive, no spaces): Halliday