I am new to R and want to retrieve only the first email from each email thread. Each time an email arrives in the inbox, a .csv file is created, so I have many files of duplicated text that differ only by the newest reply at the top of the file.
I am not sure what code I can share, because right now I don't have any idea how to even start on this part of my data cleansing.
Is there a way for me to possibly group my .csv files based on the email thread they belong to, before extracting the most duplicated text (since the first email is likely to be present in all subsequent files) to be used in my corpus for topic mining?
Or does anyone have a better way to approach this? I have considered using the threads function in tm.plugin.mail but since these are plain texts, it only returns with a depth of 1 for every email.
EDIT:
The file names are just random alphanumeric strings, and the only metadata I have contains the name of the sender, the date sent (no time provided), and the name of the file it corresponds to. Here is an idea of what my data looks like:
From: xxx#gmail.com
To: yyy#gmail.com
Subject: Re: xxx
Dear Sir,
(main content)
Yours Sincerely,
xxx
From: yyy#gmail.com
To: xxx#gmail.com
Subject: xxx
Dear Sir,
(main content)
Regards,
yyy
Generally, this is what the .csv file looks like (except that every comma is followed by a line break, as in .csv files), so it is actually quite messy. The emails are not formatted in one fixed way, so my attempt to use a regex to remove everything before the last instance of "From:" failed. Some of the emails come in another format:
On (date), (time), (name) <email address> wrote:
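As a rough illustration of the "keep only what follows the last message boundary" idea, here is a minimal sketch in Python rather than R. It assumes each file lists the newest reply first, and that a message starts either with a "From:" header line or with an "On ... wrote:" line. The function name first_email and both patterns are hypothetical and would need extending for other formats.

```python
import re

# Lines that open a message in the two formats shown above.
# Both patterns are assumptions; real exports may need more variants.
HEADER_START = re.compile(r"^From:\s*\S+", re.MULTILINE)
QUOTE_START = re.compile(r"^On .+ wrote:\s*$", re.MULTILINE)


def first_email(thread_text: str) -> str:
    """Return the oldest message, assuming the newest reply comes first."""
    starts = [m.start() for m in HEADER_START.finditer(thread_text)]
    starts += [m.start() for m in QUOTE_START.finditer(thread_text)]
    if not starts:
        # No recognizable boundary: treat the whole text as one message.
        return thread_text.strip()
    # The last boundary in the file marks the start of the oldest message.
    return thread_text[max(starts):].strip()


sample = """From: xxx#gmail.com
To: yyy#gmail.com
Subject: Re: xxx
Dear Sir,
(reply)
From: yyy#gmail.com
To: xxx#gmail.com
Subject: xxx
Dear Sir,
(main content)
Regards,
yyy"""

oldest = first_email(sample)
```

Files whose extracted first message is identical could then be treated as belonging to the same thread, which would give the grouping asked about above.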
R has a neat new package named Microsoft365R. The only issue I have personally had with it so far is reading in emails and pulling off the attachments based on system date, subject name, or both.
It is possible to pull attachments with the package, to read emails, and to pick out a specific one (e.g. the 4th email), but if I received a number of emails, I would want it to pull the data based on system date and subject name (as mentioned above), i.e.:
I get permit data for SW KPI1 and SW KPI2 (named accordingly) on the 1st of the month. I then want the system to run and look for any emails that have SW KPI in the name and were sent on the same system date, and either download them to a set folder or load them directly into R, where I can then transform them and email them out.
If anyone is aware of how to do this (either this package or another that is in version-ish), it'd be greatly appreciated.
I am trying to build a PowerApp to log setup times of our machines by our fitters.
This is what my app looks like:
There are buttons named "Uhrzeit". Pressing these will write the current date and time into the Date/Time fields. I am using the following code:
UpdateContext({Total8:(Text( Now(); "[$-de-DE]dd/mm/yyyy hh:mm:ss" ))})
The Date/Time field is named Total8.
The code is working well but after saving the form and opening a new record the old data is still available in the fields. By clicking on the button "Zeiten zurücksetzen" I can "delete" the old data.
UpdateContext({Total8:""})
Problem: When I open one of the older records the old data is not available in the form. There is only the value of the last record. In the Common Data Service where my records are saved the values are correct.
As an example, I am saving this record:
When I open a new record, the values of record 1 are still present. This should not be the case if my app worked properly.
For your information:
If I enter the date/time manually without tapping the button, then save the record and open a new one, I don't have the problem. I think UpdateContext is not what I should use here.
Can anyone help me solve the problem?
I don't think there's a problem with using the contexts in this way -- but remember that a context is just a variable. It isn't automatically linked to a datasource in any special way - so if you set it equal to Now(), it's going to keep that value until you do something different.
When you view an old record, you need to get the data from CDS and update your contexts to match the CDS data. Does this make sense?
Yeah, that's my problem.
I want the variable to be linked to a datasource. Or is it possible to write the date/time into the fields without using a context variable?
Can anybody help me with code to call a web service in MS Excel 2013, please? I was trying to use the WEBSERVICE function in Excel, but it is not working for me.
To learn how to use the Webservice function, we’ll do 2 things:
Use a =WEBSERVICE(url) function to get the data
Use the =FILTERXML(xml, xpath) function to extract a single piece of data from the XML string
Use a =WEBSERVICE(url) function to get the data
First, find a web service. For this example with weather updates, go to http://www.wunderground.com/weather/api to create your free account. Complete the form, then click Signup for API Key.
To set up your API Key, follow these steps:
Select either the Cumulus Plan or the Anvil Plan, whichever you prefer.
Choose whichever option you prefer for the History add-on. Either option will work for this example because we’re not using historical information.
Select Developer. Note: The other available options also will work for this example, but note that there is a fee associated with them.
Click Update Plan.
At the top of the page, click Documentation.
On the left navigation bar titled API Table of Contents, find the Data Features heading, then under that heading, click conditions. (You can also go to http://www.wunderground.com/weather/api/d/docs?d=data/conditions)
Scroll to the bottom of the page, then copy the URL shown in the box labeled Examples. (The URL format will look like this: http://api.wunderground.com/api/[APIKey]/conditions/q/CA/San_Francisco.json). The sample URL will include your unique API Key.
Now that you have a unique API Key, open your Excel spreadsheet and follow these steps to create the =WEBSERVICE(url) function for the current weather conditions:
In cell B5, enter =WEBSERVICE(url). Then replace url with the unique URL including your API Key that you copied a moment ago.
Add straight quotation marks to both sides of the URL. The format will look like this: "http://api.wunderground.com/api/[APIKey]/conditions/q/CA/San_Francisco.json"
Replace the state and city in the URL with a zip code, then change the extension at the end of the URL to .xml. The formula in cell B5 should look like this: =WEBSERVICE("http://api.wunderground.com/api/[APIKey]/conditions/q/[ZipCode].xml") The [APIKey] will be your unique API Key, and the [ZipCode] will be for the location where you want weather updates.
Press Enter or Return. The formula will return an XML string from the web service.
You can also use cell references in the Webservice function to update URL parameters, such as your zip code. Here is how to set it up:
In cell B1, paste your API Key. In the Name Box, type APIkey to name the cell.
In cell B2, enter the zip code. In the Name Box, type ZipCode to name the cell.
Create your WEBSERVICE function with cell references. The formula should be in this format: =WEBSERVICE("http://api.wunderground.com/api/" & APIkey & "/conditions/q/" & ZipCode & ".xml")
Copy and paste the entire formula into cell B5.
Update your zip code and then you will see the update to your WEBSERVICE Function URL.
Use the =FILTERXML(xml, xpath) function to extract single pieces of data from the XML string
Now that we have the information from the web service in the Excel spreadsheet, we need to extract the pieces of data we want out of the XML, including the name of the city and current temperature and current weather conditions. To extract the data, follow these steps:
In cell B8, enter =FILTERXML(B5, "//full"). This will give you the city name associated with the zip code.
In cell C8, enter =FILTERXML(B5, "//temp_f") to extract the current temperature in Fahrenheit.
In cell D8, enter =FILTERXML(B5, "//weather") to see the current weather condition, such as Light Rain.
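For readers who want to see the same two steps outside Excel, here is a minimal Python sketch of what the formulas do: build the URL by concatenation, then pull single values out of the XML with "//tag"-style lookups. The helper names build_url and filter_xml, the placeholder key and zip code, and the sample XML are all hypothetical. Only the element names full, temp_f and weather come from the steps above, and no request is actually sent.

```python
import xml.etree.ElementTree as ET


def build_url(api_key: str, zip_code: str) -> str:
    """Mirror the string concatenation used in the WEBSERVICE formula."""
    return (
        "http://api.wunderground.com/api/" + api_key
        + "/conditions/q/" + zip_code + ".xml"
    )


def filter_xml(xml_text: str, tag: str) -> str:
    """Return the text of the first <tag> element at any depth, like //tag."""
    root = ET.fromstring(xml_text)
    node = root.find(".//" + tag)
    return node.text if node is not None else ""


# Hypothetical response body; the real service's layout may differ.
sample_xml = """
<response>
  <current_observation>
    <display_location><full>San Francisco, CA</full></display_location>
    <weather>Light Rain</weather>
    <temp_f>57.2</temp_f>
  </current_observation>
</response>
""".strip()

url = build_url("YOUR_API_KEY", "94107")
city = filter_xml(sample_xml, "full")
temp_f = filter_xml(sample_xml, "temp_f")
weather = filter_xml(sample_xml, "weather")
```

As with FILTERXML, each lookup returns the first matching element as text, so temp_f still has to be converted to a number before any arithmetic.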
With the online weather updates, now our camping trip planning collaboration spreadsheet looks like this:
A note on refreshing data
Please note that WEBSERVICE Functions are “non-volatile”, which means they refresh only when:
A referenced cell is edited
The entire workbook is refreshed (CTRL + ALT + F9)
Remember that you can use this functionality for many different web services over the internet that you can then analyze using Excel.
I have a requirement to load CSV files into the database using Oracle APEX or PL/SQL code, but the problem is that the files will not all come with the same number of columns or the same column names.
I need to create the table and upload the data dynamically, based on the name and contents of the file I am uploading.
For every file, I need to create a new table dynamically and insert the data present in the CSV file.
For Example:
File1:
col1 col2 col3 col4 (NOTE: If I upload File 1, the table should be created dynamically based on the file name, and it should contain the same column names as the CSV headers and the same data.)
file 2:
col1 col2 col3 col4 col 5
file 3:
col4 col2 col1 col3
Depending on the columns and file name, I need to create a table for every file upload.
Can we load data like this or not?
If yes, please help me with this.
Regards,
Sachin.
(Where's the PL/SQL code in this solution!? Bear with me... the answer is buried in here somewhere... I introduced some considerations and assumptions you will need to think about before going into the task. In the end, you'll find that Oracle APEX actually has a built-in solution that satisfies exactly what you've specified... with some caveats.)
If you are working within the Oracle APEX platform, you will have some advantages. APEX Version 4.2 and higher has a new page element called "Data Loading". The disadvantage however is that the definition of the upload target is fixed and not dynamic. You will need to know how your table is structured prior to loading the data.
One approach to overcome this is to build a generic, two-column table as your target, which will serve for all uploads. Column 1 will be your file name and column 2 will be a single CLOB, which will contain the entire data file's contents including the header row. The "Data Loading" element will give the user the opportunity to verify and select this mapping convention in a couple of clicks.
At this point, it's mostly PL/SQL backend work doing the heavy lifting to parse and transform the uploaded data. As for the dynamic table creation, the Oracle package DBMS_SQL allows the execution of DDL statements, which could be the route to making custom tables.
Alex Poole's comment is important as well: you will need to make some blanket assumption about the data types, or have a provision to give more clues about what kind of data is contained. Relying on a sample of existing data values is not good... what if all the values in your upload are null? I recommend perhaps a second column in the data input with a clue about the type of data for each column... just like the intended header names, maybe: AAAAA = for a five character column, # = for a numeric, MM/DD/YYYY = for a date with a specific masking.
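To make the type-hint idea concrete, here is a minimal sketch, written in Python rather than PL/SQL, of how a header row plus a hint row could be turned into a CREATE TABLE statement. It reads the hints as a second row directly under the header, which is my interpretation of the suggestion above. The function names, the VARCHAR2(4000) fallback and the identifier clean-up rule are all hypothetical. The resulting string is what you would hand to dynamic SQL on the database side.

```python
import csv
import io
import re
from pathlib import Path


def hint_to_type(hint: str) -> str:
    """Translate a type hint from the second row into an Oracle column type."""
    hint = hint.strip()
    if hint == "#":
        return "NUMBER"
    if hint.upper() == "MM/DD/YYYY":
        return "DATE"
    if hint and set(hint) == {"A"}:
        # "AAAAA" means a five-character column, and so on.
        return f"VARCHAR2({len(hint)})"
    return "VARCHAR2(4000)"  # fallback when no usable hint is given (assumption)


def safe_name(raw: str) -> str:
    """Reduce a header or file name to a plain upper-case identifier."""
    name = re.sub(r"\W+", "_", raw.strip()).strip("_").upper()
    return name if name and not name[0].isdigit() else "C_" + name


def create_table_ddl(file_name: str, csv_text: str) -> str:
    """Build the DDL from the file name, the header row and the hint row."""
    rows = csv.reader(io.StringIO(csv_text))
    headers = next(rows)
    hints = next(rows)
    columns = ", ".join(
        f"{safe_name(h)} {hint_to_type(t)}" for h, t in zip(headers, hints)
    )
    return f"CREATE TABLE {safe_name(Path(file_name).stem)} ({columns})"


ddl = create_table_ddl(
    "file1.csv",
    "col1,col2,col3,col4\nAAAAA,#,MM/DD/YYYY,AAA\nhello,42,01/31/2020,abc\n",
)
```

The sketch does not check Oracle's identifier length limit or reserved words, and column order simply follows the file, so the three example files in the question would each get their own table layout.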
The easier route:
You will need to allow your end-user access to a developer-role account on a workspace of your APEX server. It is not as scary as you think. With careful instruction and some simple precautions, I have been able to make this work with even the most non-technical of users. The reason for this is that there is a more powerful upload tool found under the following menu item:
SQL Workshop --> Utilities --> Data Workshop
There is a choice under "Data Load" --> "Spreadsheet Data"
The data load tool will automatically do the following:
Accept a CSV formatted file through a browse function on your client machine
Upload the file and parse the first record for the column layout (names)
Allow the user to create a new table from the uploaded file, or to map to an existing one.
For new tables, each column data type can be declared and also a specific numeric/date mask if additional conversion from the uploaded data is necessary.
Delimiter type, optional enclosures (like double quotes), decimal conventions and currency types can also be declared prior to parsing the uploaded file.
Once the user has identified all these mappings and settings, the table is created with the uploaded data. Any errors in record upload are reported immediately afterwards with detailed feedback on the failed records.
A security consideration to note:
You probably do not want to give end users access to your APEX server's backend... but you CAN create a new workspace... just for your end users... create a new database schema for receiving their uploads, maybe with some careful resource controls. Developer is the minimum role needed... but even if the end users see the other stuff there won't be access to anything important from an isolated workspace.
I implemented the isolated workspace approach on a 4.0/4.1 release APEX platform a few years back, and it worked nicely. Our end user had control over the staging and quality checking of her data inputs (Excel spreadsheet/CSV exports collected from a combination of sources). I suppose it might have been even better to cut her out of the picture entirely and focus on automating the export-review-upload process between our database and her other sources. In this case, the volume of data involved was not great enough to justify that (hundreds to thousands of records), and manual review and editing of the exported data was very important before pushing it into the database, so the human element still mattered. It is something you'll want to think about now.
What is the best practice for uploading a batch of data (multiple rows) at one time? I don't want to upload any files to the server.
Is it a good idea to have a text area for entering data in a predefined format, and to write a small parser that reads and analyzes that input and inserts it into the database?
Edit:
I have the data set in an Excel file. I want to store it in the database, but I don't want to upload the file to the server.
Data sample :
id fid sid name
--------------------------------------------------------------
1- 3a3458 2a2125 3a4541 John Smith
2- 313547 3a4541 212145 Albert koku
.....................
...............
.........
100- ...
Since the data already has a predefined structure, to avoid manual errors and parsing I would build an interface containing a <table> in which the user fills in the corresponding data and then submits it to the server.
But if you don't want to bother guiding the user, you could of course use a <textarea> in which the user enters the data in some known format (CSV, JSON, XML, ...) and then do the parsing on the server.
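As an illustration of the <textarea> route, here is a minimal server-side sketch in Python. It assumes the user pastes comma-separated lines with exactly four fields (id, fid, sid, name). An in-memory SQLite database stands in for the real one, and the table name people and the function parse_rows are hypothetical.

```python
import csv
import io
import sqlite3

# Assumed layout of every pasted line.
EXPECTED_COLUMNS = ["id", "fid", "sid", "name"]


def parse_rows(pasted: str) -> list:
    """Parse pasted CSV text into tuples, rejecting malformed lines."""
    reader = csv.reader(io.StringIO(pasted.strip()))
    rows = []
    for line_no, fields in enumerate(reader, start=1):
        fields = [f.strip() for f in fields]
        if not any(fields):
            continue  # skip blank lines
        if len(fields) != len(EXPECTED_COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, "
                f"got {len(fields)}"
            )
        rows.append(tuple(fields))
    return rows


# Hypothetical pasted input in CSV form.
pasted_text = """
1,3a3458,2a2125,John Smith
2,313547,3a4541,Albert koku
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id TEXT, fid TEXT, sid TEXT, name TEXT)")
rows = parse_rows(pasted_text)
with conn:
    # Parameterized insert: pasted values are never spliced into the SQL text.
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

Validating every line before inserting anything means a single malformed row rejects the whole paste, which keeps the table free of partial uploads.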