Traverse the directory structure using Python
Overview
Hi all, I remember how scared I was when I started coding for my MS project. Slowly, though, I learned to solve a lot of coding problems. I wish I had found some good resources to help me with my project (maybe there were resources, but I was not good at googling then! 😛).
In most of my projects, the difficult part was writing a custom dataloader to fetch data stored in a different directory structure each time. Sometimes I needed to get the directory structure and store it in a text file for further processing. All such tasks require traversing the data directory. Here is a snippet that does exactly that, and it will make me happy if it helps someone with their code.
import os

# Insert the path to the data directory
root = "<dataset-root-path>"

# os.walk is a generator that traverses the directory tree rooted at
# `root`, visiting every sub-directory along the way.
# Note: os.scandir is reported to be faster for listing a directory, but I
# have not benchmarked it. (Since Python 3.5, os.walk uses os.scandir internally.)
for dirpath, dirnames, filenames in os.walk(root):
    # dirpath: path of the (sub-)directory os.walk is currently visiting
    # dirnames: list of the sub-directories in dirpath
    # filenames: list of the files present in dirpath
    # Proceed only if dirpath contains files, in case you are performing
    # an operation on files.
    if filenames:
        for name in filenames:
            ## <your-code-here> ##
            # To get the absolute path of the file use the following.
            # Remember dirpath is the current sub-directory and name is
            # one of the files in it. os.path.abspath makes the result
            # absolute even when root is a relative path.
            absolute_path = os.path.abspath(os.path.join(dirpath, name))
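The other task mentioned above, storing the directory structure in a text file, can be sketched with the same os.walk loop. This is only a minimal sketch: the function names `list_files` and `write_listing` are hypothetical, and writing one path per line, relative to the root, is my assumption about a convenient format rather than a requirement.

```python
import os
import tempfile


def list_files(root):
    """Return the sorted paths of all files under root, relative to root."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Store paths relative to root so the listing stays valid
            # if the dataset is moved (an assumption about what is useful).
            paths.append(os.path.relpath(full_path, root))
    return sorted(paths)


def write_listing(root, out_file):
    """Write one relative file path per line to out_file; return the count."""
    paths = list_files(root)
    with open(out_file, "w", encoding="utf-8") as f:
        for path in paths:
            f.write(path + "\n")
    return len(paths)


if __name__ == "__main__":
    # Build a tiny throwaway dataset so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "train", "cats"))
        os.makedirs(os.path.join(root, "val"))
        open(os.path.join(root, "train", "cats", "1.jpg"), "w").close()
        open(os.path.join(root, "val", "2.jpg"), "w").close()
        for path in list_files(root):
            print(path)
```

Keep the output file outside the data directory; otherwise the listing itself shows up the next time you traverse it.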