How does file system on UNIX find files? - unix

Suppose that a request is made to ls somefile. How does the file system in UNIX handle this request from algorithmic perspective? Is that a O(1) query or O(log(N)) depending on files say starting in current directory node, or is it a O(N) linear search, or is that a combination depending on some parameters?

It can be O(n). Classic Unix file systems, based on the old school BSD fast file system and the like, store files as inode numbers, and their names are assigned at the directory level, not at the file level. This allows you have to the same file present in multiple locations at the same time, via hard links. As such, a "directory" in most Unix systems is just a file that lists filenames and inode numbers for all the files stored "in" that directory.
Searching for a particular filename in a directory just means opening that directory file and parsing through it until you find the filename's entry.
Of course, there's many different file systems available for Unix systems these days, and some will have completely differnet internal semantics for finding files, so there's no one "right" answer.

Its O(n) since the file systems has to read it off phyical media initially, but Buffer Caches will increase that significantly based on the Virtual File System (VFS) implementation on your flavor of *nix. (Notice how the first time you access a file its slower than the second time you execute the exact same command?)
To learn more read IBM's article on the Anatomy of the Unix file system.

Typical flow for a program like ls would be
Opendir on the current path.
Readdir for the current path.
Filter the entries returned by the OpenDir through filter provided on the command line. So typically O(n)
This is the generic flow, however there are many optimizations in place for special and frequent cases (;like caching of inode numbers of recent and frequent paths.
Also it depends on how directoy file are organized. In unix it is based on time of creation forcing to read every entry and increasing the look-up time to O(n). In NTFS equivalent of directory files are sorted based on name.

I can't answer your question. Maybe if you take a peak into the source code, you could answer your question yourself and explain us how it works.
ls.c
ls.h

Related

What is the pro's and con's of locking the actual file vs an empty lock file?

My program is writing to a binary file, and there could be multiple instances of the program accessing the same binary file for the same user. In Unix/Linux, I see some programs (particularly daemon processes) locking an empty lock file instead of the actual shared data that needs to be locked (so instead of locking ~/.data/foo they lock ~/.data/foo.lck). What are the pros and cons of locking the actual file vs an empty lock file?
flock is not supported over NFS or other network file systems for all version of unix (it wasn't even supported by Linux until 2.6.12). On the other hand O_CREAT|O_EXCL is much more reliable over many more file systems, and has been so for much longer.
Even on systems that do support flock on network filesystems (or cases where you don't need that flexibility), O_CREAT|O_EXCL together with flock is very useful because it distinguishes between a clean shutdown and a non-clean shutdown. flock helpfully goes away automatically, but it also, unhelpfully, doesn't distinguish why it went away.
flocking the file itself prevents atomic writes (copy, erase old, rename), or any other case where you might erase the existing file. Sometimes "the actual file" doesn't always have the same inode over the entire run of the program. So a separate file is much more convenient in those cases as well. This is very common in those foo.lck cases, because often you're locking foo for a short period of time, and might erase it in the process.
I see three cons of an empty lock file:
The user permissions of the directory should allow you to create a file.
In case of disk space issues, this might fail.
In case your program crashes, the lockfile is still present.
I see one con of modifying the actual file's name:
In case your program crashes, your file has been altered (only the filename, but it might generate confusion).
Obviously, I see one big advantage of the empty lock file:
your original file does not change at all.
By the way, I believe this question is better suited for the SoftwareEngineering community.

Overcoming inode limitation

What's the best practise for storing a large (expanding) number of small files on a server, without running into inode limitations?
For a project, I am storing a large number of small files on a server with 2TB HD space, but my limitation is the 2560000 allowed inodes. Recently the server used up all the inodes and was unable to write new files. I subsequently moved some files into databases, but others (images and json files remain on the drive). I am currently at 58% inode usage, so a solution is required imminently.
The reason for storing the files individually is to limit the number of database calls. Basically the scripts will check if the file exists and if so, then return results dependently. Performance wise this makes sense for my application, but as stated above it has limitations.
As I understand it does not help to move the files into sub-directories, because each inode points to a file (or a directory file), so in fact I would just use up more inodes.
Alternatively I might be able to bundle the files together in an archive type of file, but that will require some sort of indexing.
Perhaps I am going about this all wrong, so any feedback is greatly appreciated.
On the advice of arkascha I looked into loop devices and found some documentation about losetup. Remains to be tested.

Difference between unlink and rm on unix

Whats the real different between these two commands? Why is the system call to delete a file called unlink instead of delete?
You need to understand a bit about the original Unix file system to understand this very important question.
Unlike other operating systems of its era (late 60s, early 70s) Unix did not store the file name together with the actual directory information (of where the file was stored on the disks.) Instead, Unix created a separate "Inode table" to contain the directory information, and identify the actual file, and then allowed separate text files to be directories of names and inodes. Originally, directory files were meant to be manipulated like all other files as straight text files, using the same tools (cat, cut, sed, etc.) that shell programmers are familiar with to this day.
One important consequence of this architectural decision was that a single file could have more than one name! Each occurrence of the inode in a particular directory file was essentially linking to the inode, and so it was known. To connect a file name to the file's inode (the "actual" file,) you "linked" it, and when you deleted the name from a directory you "unlinked" it.
Of course, unlinking a file name did not automatically mean that you were deleting / removing the file from the disk, because the file might still be known by other names in other directories. The Inode table also includes a link count to keep track of how many names an inode (a file) was known by; linking a name to a file adds one to the link count, and unlinking it removes one. When the link count drops down to zero, then the file is no longer referred to in any directory, presumed to be "unwanted," and only then can it be deleted.
For this reason the "deletion" of a file by name unlinks it - hence the name of the system call - and there is also the very important ln command to create an additional link to a file (really, the file's inode,) and let it be known another way.
Other, newer operating systems and their file systems have to emulate / respect this behavior in order to comply with the Posix standard.

Executing 'mv A B': Will the 'inode' be changed?

If we execute a command:
mv A B
then what will happen to the fields in the inode of file A? Will it change?
I don't think that it should change just by changing the name of the file, but I'm not sure.
It depends at least partially on what A and B are. If you're moving between file systems, the inode will almost certainly be different.
Simply renaming the file on the same system is more likely to keep the same inode simply because the inode belongs to the data rather than the directory entry and efficiency would lead to that design. However, it depends on the file system and is in no way mandated by standards.
For example, there may be a versioning file system with the inode concept that gives you a new inode because it wants to track the name change.
It depends.
There is a nice example on this site which shows that the inode may stay the same. But I would not rely on this behaviour, I doubt that it is specified in any standard.

What makes the Unix file system more superior to the Windows file system?

I'll admit that I don't know the inner workings of the unix operating system, so I was hoping someone could shed some light on this topic.
Why is the Unix file system better than the windows file system?
Would grep work just as well on Windows, or is there something fundamentally different that makes it more powerful on a Unix box?
e.g. I have heard that in a Unix system, the number of files in a given directory will not slow file access, while on Windows direct file access will degrade as the # of files increase in the given folder, true?
Updates:
Brad, no such thing as the unix file system?
One of the fundamental differences in filesystem semantics between Unix and Windows is the idea of inodes.
On Windows, a file name is directly attached to the file data. This means that the OS prevents somebody from deleting a file that is currently open. On some versions of Windows you can rename a file that is currently open, and on some versions you can't.
On Unix, a file name is a pointer to an inode, which is the place the file data is actually stored. This has a couple of implications:
You can have two different filenames that refer to the same underlying file. This is often called a hard link. There is only one copy of the file data, so changes made through one filename will appear in the other.
You can delete (also known as unlink) a file that is currently open. All that happens is the directory entry is removed, but this doesn't affect any other process that might still have the file open. The process with the file open hangs on to the inode, rather than to the directory entry. When the process closes the file, the OS deletes the inode because there are no more directory entries pointing at it and no more processes with the inode open.
This difference is important, but it is unrelated to things like the performance of grep.
First, there is no such thing as "the Unix file system".
Second, upon what premise does your argument rest? Did you hear someone say it was superior? Perhaps if you offered some source, we could critique the specific argument.
Edit: Okay, according to http://en.wikipedia.org/wiki/Comparison_of_file_systems, NTFS has more green boxes than both UFS1 and UFS2. If green boxes are your measure of "better", then NTFS is "better".
Still a stupid question. :-p
I think you are a little bit confused. There is no 'Unix' and 'Windows' file systems. The *nix family of filesystems include ext3, ZFS, UFS etc. Windows primarily has had support for FAT16/32 and their own filesystem NTFS. However today linux systems can read and write to NTFS. More filesystems here
I can't tell you why one could be better than the other though.
I'm not at all familiar with the inner workings of the UNIX file systems, as in how the bits and bytes are stored, but really that part is interchangeable (ext3, reiserfs, etc).
When people say that UNIX file systems are better, they might mean to be saying, "Oh ext3 stores bits in such as way that corruption happens way less than NTFS", but they might also be talking about design choices made at the common layer above. They might be referring to how the path of the file does not necessarily correspond to any particular device. For example, if you move your program files to a second disk, you probably have to refer to them as "D:\Program Files", while in UNIX /usr/bin could be a hard drive, a network drive, a CD ROM, or RAM.
Another possibility is that people are using "file system" to mean the organization of paths. Like, for instance, how Windows generally likes programs in "C:\Program Files\CompanyName\AppName" while a particular UNIX distribution might put most of them in /usr/local/bin. In the later case, you can access much more of your system readily from the command line with a much smaller PATH variable.
Also, since you mentioned grep, if all the source code for system libraries such as the kernel and libc is stored in /usr/local/src, doing a recursive grep for a particular error message coming from the guts of some system library is much simpler than if things were laid out as /usr/local/library-name/[bin|src|doc|etc]. If you already have an inkling of where you're searching, though, cygwin grep performs quite well under Windows. In fact, I find for full-text searching I get better results from grep than the search facilities built into Windows!
well the *nix filesystems do a far better job of actual file managment than fat16/32 or NTFS. The *nix systems try to prevent the need for a defrag over windows doing...nothing? Other than that I don't really know what would make one better than the other.
There are differences in how Windows and Unix operating systems expose the disk drives to users and how drive space is partitioned.
The biggest difference between the two operating systems is that Unix essentially treats all of the physical drives as one logical drive. (This isn't exactly how it works, but should give a good enough picture.) This allows a much simpler file system from the users perspective as there are no drive letters to deal with. I have a folder called /usr/bin that could span multiple physical drives. If I need to expand that partition I can do so by adding a new drive, remapping the folder, and moving the files. (Again, somewhat simplified, but it gets the point across.)
The other difference is that when you format a drive, a certain amount is set aside (by default, as an admin you can change the size to 0 if you want) for use by the "root" account (admin account) which allows an admin to almost always be able to log in to the machine even when the user has filled the disk and is receiving "out of disk space" messages.
One simple answer:
Windows is a proprietary which means no one can see it's code except windows, while unix/linux are open-source. So as it is open-source many brighter minds have contributed towards the filesystem making it one of the robust and efficient, hence effective commands like grep come to our rescue when needed truly.
I don't know enough about the guts of the file systems to answer the first, except when I read the first descriptions of NTFS it sounded an awful lot like the Berkley Fast Filesystem.
As for the second, there are plenty of greps for Windows. When I had to use Windows in the past, I always installed Cygwin first thing.
The answer turns out to have very little to do with the filesystem and everything to do with the filesystem access drivers.
In particular, the implementation of NTFS on Windows is very slow compared to ext2/ext3. Also on Windows, "can't delete file in use" even though NTFS should be able to support it.

Resources