How to create a column based on a condition in R

I am currently working on a dataset of different firms. I have each firm's longitude and latitude, and I want to find each firm's city using R.
For example, I found that Shanghai's longitude and latitude range over 120.852326~122.118227 and 30.691701~31.874634 respectively.
I first want to create a column named "city" and check whether each firm's longitude and latitude fall within Shanghai's range. If they do, R should write "Shanghai" in the "city" column; if not, the value should remain NA.
In my dataframe the longitude and latitude variables are named "longitude" and "latitude".
I am not sure how to write the code. I am really struggling at the beginning, and I would really appreciate your help!
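
One possible approach, sketched in base R: build a logical vector that is TRUE when both coordinates fall inside the Shanghai range quoted above, then fill the new column with ifelse(). The data frame name `firms`, the `firm` column, and the three sample rows are assumptions made up for illustration; only the column names "longitude" and "latitude" and the range values come from the question. A rectangular range like this only approximates a city's real boundary.

```r
# Hypothetical sample data; only the column names "longitude" and
# "latitude" are taken from the question.
firms <- data.frame(
  firm      = c("A", "B", "C"),
  longitude = c(121.47, 116.40, 121.00),
  latitude  = c(31.23, 39.90, 30.50)
)

# TRUE when both coordinates fall inside the Shanghai range from the question
in_shanghai <- firms$longitude >= 120.852326 & firms$longitude <= 122.118227 &
  firms$latitude >= 30.691701 & firms$latitude <= 31.874634

# "Shanghai" where the condition holds, NA everywhere else
firms$city <- ifelse(in_shanghai, "Shanghai", NA_character_)

print(firms)
```

For several cities, the same pattern can be repeated per city, assigning only to rows where `city` is still NA.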

Related

Perform join using latitude and longitude in R

I have an Excel spreadsheet with the latitude and longitude of bike docking stations.
I have a shapefile in R (cb_2018_17_bg_500k.shp) that has a GEOID column (12-digit FIPS code) and a column labelled geometry. The values in this column are POLYGON((longitude,latitude)).
I am trying to add a column titled FIPS to the Excel spreadsheet, so I need to somehow join each station's latitude and longitude to the GEOID column in the shapefile.
I am a novice when it comes to R.
Any advice will be much appreciated.
Rich
So far, I have only managed to load the shapefile into R.
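
A join like this is usually done as a point-in-polygon spatial join. The sketch below uses the sf package and assumes it is installed. To stay self-contained it replaces the real shapefile with a single made-up square polygon and a made-up GEOID; the `stations` table, its values, and the assumption that the spreadsheet coordinates are WGS84 (EPSG:4326) are likewise hypothetical.

```r
library(sf)

# Toy stand-in for the shapefile. With the real file this would be:
# block_groups <- st_read("cb_2018_17_bg_500k.shp")
square <- st_polygon(list(rbind(
  c(-87.70, 41.80), c(-87.60, 41.80), c(-87.60, 41.90),
  c(-87.70, 41.90), c(-87.70, 41.80)
)))
block_groups <- st_sf(
  GEOID    = "170310000001",               # made-up 12-digit code
  geometry = st_sfc(square, crs = 4269)    # NAD83, assumed for this sketch
)

# Hypothetical station table, e.g. as read from the spreadsheet
stations <- data.frame(
  station   = c("A", "B"),
  longitude = c(-87.65, -87.50),
  latitude  = c(41.85, 41.85)
)

# Turn the table into point geometries (coordinates assumed to be WGS84)
stations_sf <- st_as_sf(stations, coords = c("longitude", "latitude"),
                        crs = 4326, remove = FALSE)
# Match the polygons' coordinate reference system before joining
stations_sf <- st_transform(stations_sf, st_crs(block_groups))

# Left join: each point gets the GEOID of the polygon containing it, else NA
joined <- st_join(stations_sf, block_groups["GEOID"], join = st_within)
stations$FIPS <- joined$GEOID
```

Station A lies inside the toy polygon and receives its GEOID; station B lies outside and keeps NA.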

How to Calculate the normalised data of column using population?

London City number Data
The population of London city is 100,000. How can I normalize the data by population? I have spent a lot of time searching for a clue without success: I found ways to normalize data without a population figure, but none that use it. Can anyone help me, please?
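
If "normalizing by population" means expressing the figures as a rate per head, it comes down to dividing each value by the population and, optionally, scaling to a convenient base such as per 1,000 people. A minimal sketch under that assumption; the `counts` vector and its names are invented for illustration, and only the population of 100,000 comes from the question.

```r
# Hypothetical raw counts; the real data is not shown in the question
counts <- c(category_a = 250, category_b = 1200)

population <- 100000  # figure given in the question

# Rate per person, then rescaled to a rate per 1,000 people
per_capita <- counts / population
per_1000   <- per_capita * 1000

print(per_1000)
```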

Merging data by longitude and latitude in R

Is it possible to merge datasets in R by longitude and latitude?
I am seeking to merge two geocoded datasets from AidData, containing information on development projects all around the globe. I would like to connect projects that are within a 50 km radius of a project from the other dataset. The columns in the dataframes look something like this:
Dataframe   place_name            latitude  longitude
World Bank  Dar es Salaam         -6.82349  39.26951
China       Dar es Salaam Region  -6.83522  39.19597
The main problem is that a 50 km radius does not correspond to the same fixed change in latitude or longitude everywhere. If it did, the problem could be solved rather easily by setting an upper and lower boundary for each project and merging it with every other project that falls within those boundaries.
Is it possible to merge these data by latitude and longitude in R at all?
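
One way to get around the varying size of a degree is to compute great-circle distances directly and keep only the pairs within 50 km. The sketch below does this in base R with the haversine formula on a sphere of radius 6371 km, which is an approximation. The helper `haversine_km`, the data frame names, and the second `china` row (Dodoma) are assumptions added for illustration; the two Dar es Salaam rows come from the question.

```r
# Great-circle distance in km (haversine formula, spherical Earth assumed)
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

world_bank <- data.frame(
  place_name = "Dar es Salaam",
  latitude   = -6.82349,
  longitude  = 39.26951
)
china <- data.frame(
  place_name = c("Dar es Salaam Region", "Dodoma"),  # second row is hypothetical
  latitude   = c(-6.83522, -6.17),
  longitude  = c(39.19597, 35.74)
)

# Every combination of a World Bank row with a China row
idx <- expand.grid(wb = seq_len(nrow(world_bank)), cn = seq_len(nrow(china)))
dist_km <- haversine_km(
  world_bank$latitude[idx$wb], world_bank$longitude[idx$wb],
  china$latitude[idx$cn], china$longitude[idx$cn]
)

# Keep only pairs within the 50 km radius
keep <- dist_km <= 50
matches <- data.frame(
  wb_place = world_bank$place_name[idx$wb[keep]],
  cn_place = china$place_name[idx$cn[keep]],
  dist_km  = dist_km[keep]
)

print(matches)
```

Comparing every pair grows with the product of the two table sizes, so for large datasets it may be worth restricting candidates first, for example to projects in the same country.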

How to impute missing latitude and longitude in dataset using R

I have a dataset with 1658099 observations.
These are the variables and the number of missing observations in each column.
As part of preprocessing, how do I impute the missing longitude and latitude values? I don't think the mean of the locations makes sense, and I don't want to discard those rows either. Please help me with this. Thanks.

Nearest weather station to each zip code in large dataset?

I'm looking for an efficient way to link each record in a large dataset to its nearest NOAA weather station. The dataset contains 9-digit zip codes, and NOAA weather stations have lat long info. Anyone have tips on the best way to do this? Thanks!
EDIT: Updating with the code that worked, in case anyone else needs to find the nearest NOAA weather station for a set of zip codes, or has suggestions for better ways to do this.
The code is based on the code provided in this question: Finding nearest neighbour (log, lat), then the next closest neighbor, and so on for all points between two datasets in R
temp_stations is downloaded from https://www1.ncdc.noaa.gov/pub/data/normals/1981-2010/station-inventories/temp-inventory.txt (the weather stations used in developing the temperature dataset).
zipcode is a package that contains a dataset with the lat long for each zip code in the US.
install.packages("zipcode")
library(zipcode)
data(zipcode)
# prime.zips is a subset of "zipcode" containing only the zip codes that appear in my original dataset.
# Running the code below on the whole zipcode dataset crashed R on my computer.
install.packages("geosphere")
library(geosphere)
# distance matrix in meters: one row per zip code, one column per weather station
# (distm expects longitude first, then latitude, hence columns 3 and 2 of temp_stations)
mat <- distm(prime.zips[, c('longitude', 'latitude')], temp_stations[, c(3, 2)], fun = distGeo)
# assign the weather station id to each record in prime.zips based on the shortest distance in the matrix
prime.zips$nearest.station <- temp_stations$station.id[apply(mat, 1, which.min)]
