I'm trying to read a CSV file but I get: No such file or directory.
The file is in the tmp folder.
These are the commands:
Your file is not at hdfs:///user/hdfs/titles.csv, and this is what the error is saying.
You are only showing ls, not hdfs dfs -ls, so the file is on the local filesystem and you should be using just cat titles.csv
If you want to read a file from HDFS, you need to hdfs dfs -put titles.csv /user/hdfs/ first. (And create the user directory using hdfs dfs -mkdir -p /user/hdfs if it doesn't already exist)
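For example (assuming the file sits in the local tmp folder, as in the question):
hdfs dfs -mkdir -p /user/hdfs              # create the HDFS home directory if it does not already exist
hdfs dfs -put /tmp/titles.csv /user/hdfs/  # upload the local file into HDFS
hdfs dfs -cat /user/hdfs/titles.csv        # the file can now be read from HDFS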
Trying to copy files from HDFS to the local machine using copyToLocal with the following command:
hadoop fs -copyToLocal <remote path (HDFS file)> <destination path (my local path)>
But I am getting the following error:
No such file or directory: error
Please help me with this.
You can copy the data from HDFS to the local filesystem in the following two ways:
bin/hadoop fs -get /hdfs/source/path /localfs/destination/path
bin/hadoop fs -copyToLocal /hdfs/source/path /localfs/destination/path
An alternative way would be to download the file from HDFS to the local filesystem through the web UI: point your browser at the HDFS WebUI (namenode_machine:50070), select the file, and download it.
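For example, assuming (for illustration only) an HDFS file /user/hdfs/input.csv that should end up under /tmp on the local machine:
bin/hadoop fs -get /user/hdfs/input.csv /tmp/input.csv
ls -l /tmp/input.csv   # verify the local copy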
I have several files in a Unix directory that I have to move to Hadoop. I know the copyFromLocal command:
Usage: hadoop fs -copyFromLocal <localsrc> URI
but that only lets me copy them one at a time.
Is there any way to move all those files to HDFS in one command? I want to know if there is a way to transfer several files at once.
The put command will work.
If you want to copy a whole directory from local to HDFS:
hadoop fs -put /path1/file1 /pathx/target/
If you want to copy all files from the directory to HDFS in one go:
hadoop fs -put /path1/file1/* /pathx/target/
The put command supports multiple sources:
"Copy single src, or multiple srcs from local file system to the destination file system."
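For example, with several sources in a single command (the file names here are placeholders):
hadoop fs -put file1.csv file2.csv file3.csv /pathx/target/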
I want to know how we can convert a .xlsx file residing in HDFS to a .csv file using an R script.
I tried using the XLConnect and xlsx packages, but they give me a 'file not found' error. I am providing the HDFS location as input in the R script when using the above packages. I am able to read .csv files from HDFS using an R script (read.csv()).
Do I need to install any new packages for reading a .xlsx file present in HDFS?
Sharing the code I used:
library(XLConnect)
d1=readWorksheetFromFile(file='hadoop fs -cat hdfs://............../filename.xlsx', sheet=1)
"Error: FileNotFoundException (Java): File 'filename.xlsx' could not be found - you may specify to automatically create the file if not existing."
I am sure the file is present in the specified location.
Hope my question is clear. Please suggest a method to resolve it.
Thanks in Advance!
hadoop fs isn't a file, but a command that copies a file from HDFS to your local filesystem. Run this command from outside R (or from inside it using system), and then open the spreadsheet.
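For example (a sketch; the full HDFS path is elided in the question, so /user/hdfs/filename.xlsx below is only a placeholder):
hadoop fs -copyToLocal /user/hdfs/filename.xlsx /tmp/filename.xlsx
From inside R the same command can be issued with system("hadoop fs -copyToLocal /user/hdfs/filename.xlsx /tmp/filename.xlsx"), after which readWorksheetFromFile(file='/tmp/filename.xlsx', sheet=1) reads the local copy.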
I am trying to zip files matching Amazon*.xls in Unix and also remove the source files after compression. Below is the command I used:
zip -m Amazon`date +%Y-%m-%d:%H:%M:%S`.zip Amazon*.xls
For the above command I am getting the below error:
zip I/O error: No such file or directory
zip error: Could not create output file Amazon.zip
PS: GZIP is working fine. I need zip format files.
It is not zip; it is how your shell deals with expanding/substituting variables. A two-line solution for bash:
export mydate=`date +%Y-%m-%d:%H:%M:%S`
zip -m Amazon_$mydate.zip *matrix*
Execute the two lines by hand (the few seconds' difference does not matter) or, better, put them in a shell script myzipper.sh and just source it.
Use '-p' instead of '-m' if the zip files are to be extracted on Windows:
export mydate=`date +%Y-%m-%d:%H:%M:%S`
zip -p Amazon_$mydate.zip *matrix*
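A minimal myzipper.sh along those lines (a sketch; it just reuses the two lines above with the Amazon*.xls pattern from the question):
# myzipper.sh -- source it from your shell: . myzipper.sh
export mydate=`date +%Y-%m-%d:%H:%M:%S`
zip -m Amazon_$mydate.zip Amazon*.xls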
I have zip files in HDFS. I am going to write a MapReduce program in R. R has a command to unzip a zip file:
unzip("filepath")
But it is not accepting my HDFS file path. I tried:
unzip(hdfs.file("HDFS file path"))
It throws an error:
invalid path argument
Is there any way to give an HDFS file path to my R commands?
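One approach, following the same copy-to-local pattern as the answers above (a sketch; /user/hdfs/data.zip and the /tmp paths are only placeholders):
# Bring the archive out of HDFS first, then unzip the local copy
hadoop fs -copyToLocal /user/hdfs/data.zip /tmp/data.zip
unzip /tmp/data.zip -d /tmp/data
From R both commands can be run with system(), after which unzip("/tmp/data.zip") also works on the local path.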