Linux File System, Process and Open File Table

I'm a little bit confused about process and open file tables.
I know that if 2 processes try to open the same file, there will be 2 entries in the open file table. I am trying to find out the reason for this.
Why are there 2 entries created in the open file table when 2 different processes try to reach the same file? Why can't it be done with 1 entry?

I'm not quite clear what you mean by "file tables". There are no common structures in the Linux kernel referred to as "file tables".
There is /etc/fstab, which stands for "filesystem table", which lists filesystems which are automatically mounted when the system is booted.
The "filetable" Stack Overflow tag that you included in this question is for SQL Server and not directly connected with Linux.
What it sounds like you are referring to when you talk about open files is links. See Hard and soft link mechanism. When a file is open in Linux, the kernel maintains what is basically another hard link to the file. That is why you can actually delete a file that is open and the system will continue running normally. Only when the application closes the file will the space on the disk actually be marked as free.
So for each inode on a filesystem (an inode is generally what we think of as a file), there are often multiple links--one for each entry in a directory, and one for each time an application opens the file.
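You can watch this happen from a small C program. Here is a minimal sketch (the filename scratch.txt is just an illustration): it opens a file, unlinks it, and shows that the data is still readable through the open descriptor.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    /* Create a scratch file and write some data to it. */
    int fd = open("scratch.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "still here\n", 11);

    /* Remove the directory entry. The inode survives because the
       open descriptor still counts as a reference to it. */
    unlink("scratch.txt");

    /* The data is still readable through the open descriptor. */
    char buf[32];
    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, buf, sizeof buf - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("read after unlink: %s", buf);
    }

    close(fd); /* only now is the disk space actually freed */
    return 0;
}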
Update: Here is a quote from the web page that inspired this question:
Each file table entry contains information about the current file. Foremost, is the status of the file, such as the file read or write status and other status information. Additionally, the file table entry maintains an offset which describes how many bytes have been read from (or written to) the file indicating where to read/write from next.
So, to directly answer the question, "Why are there 2 entries created in the open file table when 2 different processes try to reach the same file?": 2 entries are required because they may contain different information. One process may open the file read-only while the other opens it read-write. And the file offset (position within the file) for each process will almost certainly be different.
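To make the offset point concrete, here is a minimal C sketch. It opens the same file twice from a single process, which creates two open file table entries exactly as two separate processes would, and shows that the two offsets move independently (/etc/hostname is just a convenient file to read; any file with a few bytes of content works):

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    /* Two independent open() calls, as two processes would make.
       Each gets its own open file table entry, so each has its
       own flags and its own offset. */
    int fd1 = open("/etc/hostname", O_RDONLY);
    int fd2 = open("/etc/hostname", O_RDONLY);
    if (fd1 < 0 || fd2 < 0) { perror("open"); return 1; }

    char c;
    read(fd1, &c, 1);          /* advances fd1's offset to 1 */
    read(fd1, &c, 1);          /* ... and then to 2 */

    printf("fd1 offset: %ld\n", (long)lseek(fd1, 0, SEEK_CUR)); /* 2 */
    printf("fd2 offset: %ld\n", (long)lseek(fd2, 0, SEEK_CUR)); /* 0, unaffected */

    close(fd1);
    close(fd2);
    return 0;
}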

Related

How to access and edit Rprofile?

New to R. How does one access and edit Rprofile?
On start-up, R will look for the .Rprofile file in the following places:
(https://csgillespie.github.io/efficientR/set-up.html)
R_HOME: the directory in which R is installed.
Find out where your R_HOME is with the R.home() command.
HOME: the user’s home directory. You can ask R where this is with path.expand("~").
R’s current working directory. This is reported by getwd().
Note that although there may be several .Rprofile files, R will only use one in a given session. The preference order is:
Current project > HOME > R_HOME
To create a project-specific start-up script create a .Rprofile file in the project’s root directory.
You can access and edit the different .Rprofile files via
file.edit(file.path("~", ".Rprofile")) # edit .Rprofile in HOME
file.edit(".Rprofile") # edit project specific .Rprofile
There is information about the options you can set via
help("Rprofile")
As mentioned, the link above provides additional details, but the points outlined above should show you where the files are and how to access them.

WebDAV: access a text file line by line

I have looked all over (spent about 7 hours). I have found numerous articles on how to map a drive (Google Drive, OneDrive, etc.). What I cannot seem to find an answer to is this: once I have mapped the drive, can I use the files on that drive just like I use files on a server? Open the file, read a record, write a record. I have created a file, mapped a network drive, written records to the file and retrieved records from the file. I have a home-grown database that is implemented with a large binary (as opposed to text) file. I have to go to a byte position and read a fixed number of bytes. If WebDAV is copying the file to my computer and then writing it back, this would make my file access way too slow, and I cannot seem to find an answer. Some programmers I have talked to say I cannot even do that, yet I can. Any direction would be very much appreciated.
Charlie
That's likely because standard WebDAV doesn't allow updating only a range of a resource, so the whole thing needs to be written back.
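For contrast, the record-at-a-byte-position access the question describes is straightforward against a local file. Here is a minimal C sketch in which the record size, the record number, and the file mydb.bin are all hypothetical; the pwrite() at the end is exactly the partial update that plain WebDAV cannot express for a remote resource, which is why the whole file gets transferred instead.

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

#define REC_SIZE 128  /* hypothetical fixed record size */

int main(void) {
    int fd = open("mydb.bin", O_RDWR);  /* hypothetical database file */
    if (fd < 0) { perror("open"); return 1; }

    /* Read record #42 directly: seek to its byte position,
       read exactly REC_SIZE bytes. */
    char rec[REC_SIZE];
    if (pread(fd, rec, REC_SIZE, (off_t)42 * REC_SIZE) != REC_SIZE) {
        perror("pread"); close(fd); return 1;
    }

    /* Rewrite the same record in place: a partial update that
       standard WebDAV cannot express for a remote resource. */
    if (pwrite(fd, rec, REC_SIZE, (off_t)42 * REC_SIZE) != REC_SIZE) {
        perror("pwrite"); close(fd); return 1;
    }

    close(fd);
    return 0;
}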

Couple questions about using Rsync?

I can't find any reliable file syncing program for my Mac, so I have been using the command line Rsync between two folders.
I have been using "rsync -r source destination".
-Does this sync files both ways, or only sync the source to the destination?
-If a file was previously synced between the two folders, but deleted because it is no longer needed, does it get deleted on both the source and destination, or will it just always get copied to where it is missing from?
No, rsync will synchronise the contents of the source directory to the destination directory. In that respect it is one-way. Optionally you can force it to delete files from the destination that no longer exist in the source.
If you want to keep the most recent changes on both machines, you would have to supply a more complicated rsync incantation and set up both machines as rsync servers. I imagine doing so will get you into trouble eventually, especially if you want to be authoritarian over deletion.
In any case, you can use the -u (or --update) option which will skip any files that are newer on the destination end. You do have to worry about the timestamps, and this will not handle any conflicts or merges. Still... It may be as simple as:
rsync -u -r target1 target2
rsync -u -r target2 target1
That won't do anything about deletion. You have no way of knowing that a missing file on one target was deleted there instead of a new file having been created on the other target.
This is why version control was invented... And for people who are scared of version control, services like DropBox exist.
Answering the original question:
1) It synchronizes files in one direction only, depending on whether you use the push or the pull mechanism (see the manual page via "man rsync"). So, for the rest of your question, don't assume that it works both ways.
2) The file only gets deleted in the destination directory. See "rsync --help" for the details: the --delete option deletes extraneous files from the destination dirs, and there are other delete-related options as well.
3) The missing files will be copied to the destination directory only, e.g. to the remote machine/directory if you are pushing files.
A sample example of the push mechanism:
rsync -avz /home/local_dir/abc.txt remoteuser@192.168.xx.xx:/home/remoteuser/
If a file named abc.txt is already present in the destination directory, it will be updated depending on whether the copy there is an older version of the local abc.txt. And if abc.txt is not present in the remote directory, a completely new file named abc.txt will be created with the contents of the local version.
A sample example of the pull mechanism:
rsync -avz remoteuser@192.168.xx.xx:/home/remoteuser/abc.txt /home/local_dir/
If a file named abc.txt is already present in the local directory, it will be updated depending on whether the copy there is an older version of the remote abc.txt. And if abc.txt is not present in the local directory, a completely new file named abc.txt will be created with the contents of the remote version.

Unix .pid file created automatically?

Possible Duplicate:
Reference for proper handling of PID file on Unix
Is the .pid file in unix created automatically by the operating system whenever a process runs OR should the process create a .pid file programmatically like "echo $$ > myprogram.pid" ?
The latter holds true -- a process must create the .pid file (the OS won't do it). pid files are often used by daemon processes for various purposes. For example, they can be used to prevent a process from running more than once. They can also be used so that control processes (e.g. apache2ctl) know what process to send signals to.
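Here is a minimal C sketch of a process creating its own pid file; the path myprogram.pid follows the question's example, and the O_EXCL flag is one common way to get the "prevent running more than once" behavior, since the create fails if the file already exists:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void) {
    /* O_EXCL makes open() fail if the pid file already exists,
       so a second instance refuses to start. */
    int fd = open("myprogram.pid", O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) {
        perror("myprogram.pid");  /* probably already running */
        return 1;
    }

    /* The C equivalent of `echo $$ > myprogram.pid`. */
    char buf[32];
    int len = snprintf(buf, sizeof buf, "%d\n", (int)getpid());
    write(fd, buf, len);
    close(fd);

    /* ... the program's real work happens here ... */

    unlink("myprogram.pid"); /* clean up on normal exit */
    return 0;
}

Note that a crash leaves a stale pid file behind, so robust daemons also check whether the pid recorded in an existing file still refers to a live process before refusing to start.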

Strategy for handling user input as files

I'm creating a script to process files provided to us by our users. Everything happens within the same UNIX system (running on Solaris 10).
Right now our design is this:
1. User places file into upload directory
2. Script placed on cron to run every 10 minutes
3. Script looks for files in upload directory, processes them, deletes immediately afterward
For historical/legacy reasons, #1 can't change. Also, deleting the file after processing is a requirement.
My primary concern is concurrency. It is very likely that the situation will arise where the analysis script runs while an input file is still being written to. In this case, data will be lost, and this is (obviously) unacceptable.
Since we have no control over the user's chosen means of placing the input file, we cannot require them to obtain a file lock. As I understand, file locks are advisory only on UNIX. Therefore a user must choose to adhere to them.
I am looking for advice on best practices for handling this problem. Thanks
Obviously all the best solutions involve the client providing some kind of trigger indicating that it has finished uploading. That could be a second file, an atomic move of the file to a processing directory after writing it to a stage directory, or a REST web service. I will assume you have no control over your clients and are unable or unwilling to change anything about them.
In that case, you still have a few options:
You can use a pretty simple heuristic: check the file size, wait 5 seconds, check the file size again. If it didn't change, it's probably good to go (see the sketch after this list).
If you have super-user privileges, you can use lsof to determine if anyone has this file open for writing.
If you have access to the thing that handles upload (HTTP, FTP, a setuid script that copies files?) you can put triggers in there of course.
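Here is a minimal C sketch of the first option, the size-stability heuristic. The path upload/incoming.dat and the 5-second window are illustrative, and it remains only a heuristic: a writer that pauses for longer than the window will fool it.

#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

/* Return 1 if the file's size is unchanged after wait_secs seconds,
   0 if it changed (probably still being written), -1 on error. */
static int looks_complete(const char *path, unsigned wait_secs) {
    struct stat before, after;
    if (stat(path, &before) != 0) return -1;
    sleep(wait_secs);
    if (stat(path, &after) != 0) return -1;
    return before.st_size == after.st_size;
}

int main(void) {
    switch (looks_complete("upload/incoming.dat", 5)) {
    case 1:  puts("size stable for 5s: probably safe to process"); break;
    case 0:  puts("still growing: skip it on this cron run");      break;
    default: perror("stat");                                       break;
    }
    return 0;
}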
