CrossReferencing error: how can I fix it in MetaboAnalystR? - r

I am trying to run metabolite enrichment with MetaboAnalystR, and when I want to cross-reference my list of compounds against the libraries, I get an error. This is my code:
tmp.vec <- c("L-Alanine", "Hexadecanoic acid", "L-Phenylalanine", "O-Propanoylcarnitine", "L-Methionine",
"L-Palmitoylcarnitine", "Triacylglycerol")
mSet<-InitDataObjects("conc", "msetora", FALSE)
mSet<-Setup.MapData(mSet, tmp.vec)
mSet<-CrossReferencing(mSet, "name")
On the last line, I get this error:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:21 --:--:-- 0
curl: (28) Failed to connect to www.metaboanalyst.ca port 443 after 21166 ms: Timed out
[1] "Download unsucceful. Ensure that curl is downloaded on your computer."
[1] "Attempting to re-try download using libcurl..."
trying URL 'https://www.metaboanalyst.ca/resources/libs/compound_db.qs'
Error in download.file(lib.url, destfile = filenm, method = "libcurl") :
cannot open URL 'https://www.metaboanalyst.ca/resources/libs/compound_db.qs'
In addition: Warning message:
In download.file(lib.url, destfile = filenm, method = "libcurl") :
URL 'https://www.metaboanalyst.ca/resources/libs/compound_db.qs': Timeout of 60 seconds was reached
What should I do to solve this?
Thank you for your help.
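The log shows that R cannot reach www.metaboanalyst.ca at all (curl times out connecting on port 443), so this is a connectivity problem rather than a problem in the code. Below is a minimal workaround sketch, under two assumptions: that the server becomes reachable once R's default 60-second download timeout is lifted, and that CrossReferencing() reuses a compound_db.qs already present in the working directory instead of re-downloading it:

options(timeout = 600)  # assumption: lift R's default 60-second download timeout

# fetch the compound library manually into the working directory
lib.url <- "https://www.metaboanalyst.ca/resources/libs/compound_db.qs"
download.file(lib.url, destfile = "compound_db.qs",
              method = "libcurl", mode = "wb")

# then re-run the mapping steps from the question
mSet <- InitDataObjects("conc", "msetora", FALSE)
mSet <- Setup.MapData(mSet, tmp.vec)
mSet <- CrossReferencing(mSet, "name")

If the manual download also times out, the site is likely unreachable from your network (firewall, proxy, or regional block), and trying from another network or a VPN is the next experiment.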

Related

WordPress / WooCommerce update statement blocks entire MariaDB database on WP Multisite setup

I have a WordPress/WooCommerce stack with a couple of plugins activated. Once or twice a day the MariaDB database (AWS RDS MariaDB 10.6) hangs and becomes completely unreachable, to the point where I need to reboot the instance.
The error log shows 'too many connections' errors, and the connection count indeed reaches the max_connections value. I set up slow query logging to examine what happens when this occurs.
When the problem occurred again, the slow query log indeed got written up to the point where the RDS instance hung again and needed a reboot.
It seems to be an UPDATE statement. The slow query log (header truncated) reads:
[...] (managed by https://aws.amazon.com/rds/). started with:
Tcp port: 3306 Unix socket: /tmp/mysql.sock
Time Id Command Argument
# Time: 221108 14:00:54
# User@Host: wordpress[wordpress] @ [10.0.0.191]
# Thread_id: 151359 Schema: wp_multisite_002 QC_hit: No
# Query_time: 50.027958 Lock_time: 0.050058 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 0 Bytes_sent: 67
use wp_multisite_002;
SET timestamp=1667916054;
UPDATE `wp_2_options` SET `option_value` = '1668002404' WHERE `option_name` = '_transient_timeout_et_core_path';
# User@Host: wordpress[wordpress] @ [10.0.0.191]
# Thread_id: 151362 Schema: wp_multisite_002 QC_hit: No
# Query_time: 50.053338 Lock_time: 0.050050 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 0 Bytes_sent: 67
SET timestamp=1667916054;
UPDATE `wp_2_options` SET `option_value` = '1668002404' WHERE `option_name` = '_transient_timeout_et_core_path';
# Time: 221108 14:00:55
# User@Host: wordpress[wordpress] @ [10.0.0.191]
# Thread_id: 151364 Schema: wp_multisite_002 QC_hit: No
# Query_time: 50.047369 Lock_time: 0.050090 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 0 Bytes_sent: 67
SET timestamp=1667916055;
UPDATE `wp_2_options` SET `option_value` = '1668002405' WHERE `option_name` = '_transient_timeout_et_core_path';
# Time: 221108 14:00:58
# User@Host: wordpress[wordpress] @ [10.0.0.191]
# Thread_id: 151371 Schema: wp_multisite_002 QC_hit: No
# Query_time: 50.001565 Lock_time: 0.050071 Rows_sent: 0 Rows_examined: 0
# Rows_affected: 0 Bytes_sent: 67
SET timestamp=1667916058;
Does anyone know what could cause this, and what this statement is used for? A Google search on _transient_timeout_et_core_path doesn't give me much information.
I can't replicate the problem because I don't know why it occurs.
It does happen every day, once or twice, with no excessive number of users (mostly 1 or 2 at the same time), and I'm not even sure anyone is actively using the site when it happens.
Thanks

Extracting RIFF data from both .wav and .flac files

Wave files can contain unofficial metadata, such as the sampler chunk ("smpl"):
https://sites.google.com/site/musicgapi/technical-documents/wav-file-format#smpl
These are used by audio looping players and samplers to avoid loading multiple samples.
I have one such file here:
https://github.com/studiorack/basic-harmonica/blob/bf42d5bab7470cc201e3c4b6dee7925b19db6bff/samples/harmonica_1.wav
and a flac file converted using the official flac command line tool:
flac harmonica_1.wav --keep-foreign-metadata
https://github.com/studiorack/basic-harmonica/blob/main/samples/harmonica_1.flac
When running tools such as Hex Fiend, I can confirm the metadata exists in each file:
https://hexfiend.com
However, I do see a difference in the number of bytes (I believe because flac stores the riff data in multiple places).
I can also convert the .flac file back to .wav, and it is the same size and contains the metadata: flac -d harmonica_1.flac --keep-foreign-metadata
When using other tools I can read the data:
$ sndfile-info har.wav
smpl : 60
Manufacturer : 0
Product : 0
Period : 20833 nsec
Midi Note : 64
Pitch Fract. : 0
SMPTE Format : 0
SMPTE Offset : 00:00:00 00
Loop Count : 1
Cue ID : 131072 Type : 0 Start : 12707 End : 47221 Fraction : 0 Count : 0
Sampler Data : 0
https://linux.die.net/man/1/sndfile-info
This only works for .wav files. There is a feature request for libsndfile to support 'smpl' in flac files:
https://github.com/libsndfile/libsndfile/issues/59
$ metaflac ./har.flac --list
METADATA block #7
  type: 2 (APPLICATION)
  is last: false
  length: 20
  application ID: 72696666
  data contents: smpl... (binary)
https://xiph.org/flac
However, as you can see, the results returned are different (the application ID 72696666 is hex for the ASCII string "riff"). I would like both the .wav and .flac RIFF "smpl" data to be returned in the same format, so I can verify that the results match.
https://exiftool.org appears to be a tool that can do that, but it also produced inconsistent results between .wav and .flac:
$ exiftool -a -G1 -s ./har.wav
[ExifTool] ExifToolVersion : 12.42
[System] FileName : har.wav
[System] Directory : .
[System] FileSize : 95 kB
[System] FileModifyDate : 2022:10:11 21:16:37-07:00
[System] FileAccessDate : 2022:10:15 14:39:46-07:00
[System] FileInodeChangeDate : 2022:10:15 14:39:50-07:00
[System] FilePermissions : -rw-r--r--
[File] FileType : WAV
[File] FileTypeExtension : wav
[File] MIMEType : audio/x-wav
[RIFF] Encoding : Microsoft PCM
[RIFF] NumChannels : 1
[RIFF] SampleRate : 48000
[RIFF] AvgBytesPerSec : 96000
[RIFF] BitsPerSample : 16
[RIFF] Manufacturer : 0
[RIFF] Product : 0
[RIFF] SamplePeriod : 20833
[RIFF] MIDIUnityNote : 64
[RIFF] MIDIPitchFraction : 0
[RIFF] SMPTEFormat : none
[RIFF] SMPTEOffset : 00:00:00:00
[RIFF] NumSampleLoops : 1
[RIFF] SamplerDataLen : 0
[RIFF] SamplerData : (Binary data 20 bytes, use -b option to extract)
[RIFF] UnshiftedNote : 64
[RIFF] FineTune : 0
[RIFF] Gain : 0
[RIFF] LowNote : 0
[RIFF] HighNote : 127
[RIFF] LowVelocity : 0
[RIFF] HighVelocity : 127
[RIFF] Comment : Recorded on 7/10/2022 in Edison.
[RIFF] Software : FL Studio 20
[Composite] Duration : 0.99 s
And for flac:
$ exiftool -a -G1 -s ./har.flac
[ExifTool] ExifToolVersion : 12.42
[System] FileName : har.flac
[System] Directory : .
[System] FileSize : 83 kB
[System] FileModifyDate : 2022:10:11 20:59:37-07:00
[System] FileAccessDate : 2022:10:15 14:44:12-07:00
[System] FileInodeChangeDate : 2022:10:15 14:42:26-07:00
[System] FilePermissions : -rw-r--r--
[File] FileType : FLAC
[File] FileTypeExtension : flac
[File] MIMEType : audio/flac
[FLAC] BlockSizeMin : 4096
[FLAC] BlockSizeMax : 4096
[FLAC] FrameSizeMin : 3442
[FLAC] FrameSizeMax : 6514
[FLAC] SampleRate : 48000
[FLAC] Channels : 1
[FLAC] BitsPerSample : 16
[FLAC] TotalSamples : 47222
[FLAC] MD5Signature : f89646c0d3056ec38c3e33ca79299253
[Vorbis] Vendor : reference libFLAC 1.4.1 20220922
[Composite] Duration : 0.98 s
How can I read this data consistently regardless of .flac or .wav file?
I was helped by the creator of exiftool here:
https://exiftool.org/forum/index.php?topic=14064.0
In short, the flac riff blocks were stored in a custom metadata format which ExifTool could parse, but it needed a custom .ExifTool_config file.
The creator added the necessary changes in a commit:
https://github.com/exiftool/exiftool/commit/5c2467fa6cdb38233793884e80cee9abf4da48e6#diff-0c24c6846e8207ad8d090e564fdc366dad6386f2ef7c51eea5aa0d72d970ff11
The latest release of ExifTool 12.49 now parses .wav and .flac loop data!
"Decode 'riff' metadata blocks in FLAC audio files"
https://exiftool.org/history.html
Usage:
exiftool ./har.wav
exiftool ./har.flac
Output:
Encoding : Microsoft PCM
Num Channels : 1
Sample Rate : 48000
Avg Bytes Per Sec : 96000
Bits Per Sample : 16
Manufacturer : 0
Product : 0
Sample Period : 20833
MIDI Unity Note : 64
MIDI Pitch Fraction : 0
SMPTE Format : none
SMPTE Offset : 00:00:00:00
Num Sample Loops : 1
Sampler Data Len : 0
Sampler Data : (Binary data 20 bytes, use -b option to extract)
Unshifted Note : 64
Fine Tune : 0
Gain : 0
Low Note : 0
High Note : 127
Low Velocity : 0
High Velocity : 127
Acidizer Flags : One shot
Root Note : High C
Beats : 2
Meter : 4/4
Tempo : 0
Comment : Recorded on 7/10/2022 in Edison.
Software : FL Studio 20
Duration : 0.87 s

Why is my code taking so long to return results?

When running this code, I have to wait 10 seconds for s.Locations to print and 60+ seconds for n.Titles to print. What is causing this?
Tips on how to troubleshoot this would be helpful, i.e. seeing how long it takes for certain lines of code to complete. I'm new to Go, so I'm not sure how exactly to do this.
I've made sure I close my connections. Since everything else on my computer loads blazingly fast, I don't think accessing the internet via http.Get should be slow.
package main

import (
    "encoding/xml"
    "fmt"
    "io/ioutil"
    "net/http"
    "strings"
)

// SitemapIndex is the root xml
type SitemapIndex struct {
    Locations []string `xml:"sitemap>loc"`
}

// News is the individual categories
type News struct {
    Titles    []string `xml:"url>news>title"`
    Keywords  []string `xml:"url>news>keywords"`
    Locations []string `xml:"url>loc"`
}

// NewsMap pairs a news item's keywords with its location
type NewsMap struct {
    Keywords string
    Location string
}

func main() {
    var s SitemapIndex
    var n News
    // np := make(map[string]NewsMap)
    resp, _ := http.Get("https://www.washingtonpost.com/news-sitemaps/index.xml")
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &s)
    resp.Body.Close()
    for i := range s.Locations {
        s.Locations[i] = strings.TrimSpace(s.Locations[i])
    }
    fmt.Println(s.Locations) // slice of data
    for _, Location := range s.Locations {
        resp, _ := http.Get(Location)
        bytes, _ := ioutil.ReadAll(resp.Body)
        xml.Unmarshal(bytes, &n)
        resp.Body.Close()
    }
    fmt.Println(n.Titles)
}
I get the output, but I have to wait 10 seconds for s.Locations and 60+ seconds for n.Titles. Tips on how to troubleshoot this would be helpful.
Start with the simple things, measuring one thing at a time, as in a scientific experiment.
Use curl to measure the basic response time.
$ curl https://www.google.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7246 0 7246 0 0 94103 0 --:--:-- --:--:-- --:--:-- 94103
$ curl https://www.nytimes.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 934 100 934 0 0 61 0 0:00:15 0:00:15 --:--:-- 230
$ curl https://www.washingtonpost.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3360 100 3360 0 0 133 0 0:00:25 0:00:25 --:--:-- 869
$
Google has no delay. The New York Times has a 15-second delay. The Washington Post has a 25-second delay.
In Go, confirm that The Washington Post has a 25-second delay.
$ go run wapo.go
25.174366651s
$ cat wapo.go
package main

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "os"
    "time"
)

func main() {
    start := time.Now()
    resp, err := http.Get("https://www.washingtonpost.com/news-sitemaps/index.xml")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
    _, err = ioutil.ReadAll(resp.Body)
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
    resp.Body.Close()
    fmt.Fprintln(os.Stderr, time.Since(start))
}
$
Next, try a different ISP from a different computer.
$ curl https://www.google.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 7246 0 7246 0 0 27343 0 --:--:-- --:--:-- --:--:-- 27343
$ curl https://www.nytimes.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 934 100 934 0 0 2017 0 --:--:-- --:--:-- --:--:-- 2017
$ curl https://www.washingtonpost.com/robots.txt -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3360 100 3360 0 0 356 0 0:00:09 0:00:09 --:--:-- 840
$ curl https://www.washingtonpost.com/news-sitemaps/index.xml -o /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1101 100 1101 0 0 104 0 0:00:10 0:00:10 --:--:-- 266
$
$ go run wapo.go
8.378458882s
$
Google has no delay. The New York Times has a small delay. The Washington Post has a 9-second delay.
The Go code and compiler are the same:
$ go version
go version devel +a25c2878c7 Sat Jul 27 23:29:18 2019 +0000 linux/amd64
$ cat wapo.go
package main

import (
    "fmt"
    "net/http"
    "os"
    "time"
)

func main() {
    start := time.Now()
    resp, err := http.Get("https://www.washingtonpost.com/news-sitemaps/index.xml")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        return
    }
    resp.Body.Close()
    fmt.Fprintln(os.Stderr, time.Since(start))
}
$
Therefore, focus on network and site factors.

How to set the right RCurl options to download from NSE website

I am trying to download files from the NSE India website (nseindia.com). The problem is that the webmaster does not like scraping programs downloading files or reading pages from the website. They seem to have a user-agent-based restriction.
The file I am trying to download is http://www.nseindia.com/archives/equities/bhavcopy/pr/PR280815.zip
I am able to download this from the Linux shell using:
curl -v -A "Mozilla" http://www.nseindia.com/archives/equities/bhavcopy/pr/PR280815.zip
The output is this:
* About to connect() to www.nseindia.com port 80 (#0)
*   Trying 115.112.4.12... connected
> GET /archives/equities/bhavcopy/pr/PR280815.zip HTTP/1.1
> User-Agent: Mozilla
> Host: www.nseindia.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: Oracle-iPlanet-Web-Server/7.0
< Content-Length: 374691
< X-frame-options: SAMEORIGIN
< Last-Modified: Fri, 28 Aug 2015 12:20:02 GMT
< ETag: "5b7a3-55e051f2"
< Accept-Ranges: bytes
< Content-Type: application/zip
< Date: Sat, 29 Aug 2015 17:56:05 GMT
< Connection: keep-alive
< { [data not shown] PK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  5  365k    5 19977    0     0  34013      0  0:00:11 --:--:--  0:00:11 56592
This allows me to download the file.
The code I am using with RCurl is this:
library("RCurl")
jurl <- "http://www.nseindia.com/archives/equities/bhavcopy/pr/PR280815.zip"
juseragent <- "Mozilla"
myOpts = curlOptions(verbose = TRUE, header = TRUE, useragent = juseragent)
jfile <- getURL(jurl,.opts=myOpts)
This, too, does not work.
I have also unsuccessfully tried using download.file from the base library with the user agent changed.
Any help will be appreciated.
library(curl) # note: this is the curl package, not RCurl; you need to install it separately

# download the file to the working directory:
curl_download("http://www.nseindia.com/archives/equities/bhavcopy/pr/PR280815.zip",
              "tt.zip", handle = new_handle("useragent" = "my_user_agent"))
First, your problem is not setting the user agent, but downloading binary data. This works:
jfile <- getURLContent(jurl, .opts=myOpts, binary=TRUE)
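As a small follow-up sketch (jfile is the object from the line above, and the file names are just illustrative): getURLContent(..., binary = TRUE) returns a raw vector, so the download can be persisted with writeBin() and then unpacked with unzip():

writeBin(as.vector(jfile), "PR280815.zip")       # write the raw bytes to disk
files <- unzip("PR280815.zip", exdir = tempdir()) # extract; returns the file names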
Here is a (more) complete example using httr instead of RCurl.
library(httr)
url <- "http://www.nseindia.com/archives/equities/bhavcopy/pr/PR280815.zip"
response <- GET(url, user_agent("Mozilla"))
response$status # 200 OK
# [1] 200
tf <- tempfile()
writeBin(content(response, "raw"), tf) # write response content (the zip file) to a temporary file
files <- unzip(tf, exdir=tempdir()) # unzips to system temp directory and returns a vector of file names
df.lst <- lapply(files[grepl("\\.csv$",files)],read.csv) # convert .csv files to list of data.frames
head(df.lst[[2]])
# SYMBOL SERIES SECURITY HIGH.LOW INDEX.FLAG
# 1 AGRODUTCH EQ AGRO DUTCH INDUSTRIES LTD H NA
# 2 ALLSEC EQ ALLSEC TECHNOLOGIES LTD H NA
# 3 ALPA BE ALPA LABORATORIES LTD H NA
# 4 AMTL EQ ADV METERING TECH LTD H NA
# 5 ANIKINDS BE ANIK INDUSTRIES LTD H NA
# 6 ARSHIYA EQ ARSHIYA LIMITED H NA

AIR NativeProcess does the job, but it sends progress data to ProgressEvent.STANDARD_ERROR_DATA

In a Flex AIR application, I would like to upload a file to an FTP server with the NativeProcess API and curl.
Here is the simplified code:
protected function startProcess(event:MouseEvent):void
{
    var processInfo:NativeProcessStartupInfo = new NativeProcessStartupInfo();
    processInfo.executable = new File('/usr/bin/curl');

    var processArgs:Vector.<String> = new Vector.<String>();
    processArgs.push("-T");
    processArgs.push("/Users/UserName/Desktop/001.mov");
    processArgs.push("ftp://domainIp//www/site.com/");
    processArgs.push("--user");
    processArgs.push("username:password");
    processInfo.arguments = processArgs;

    var process:NativeProcess = new NativeProcess();
    process.addEventListener(ProgressEvent.STANDARD_OUTPUT_DATA, outputDataHandler);
    process.addEventListener(ProgressEvent.STANDARD_ERROR_DATA, errorOutputDataHandler);
    process.start(processInfo);
}
It does the job well (i.e. the target file is uploaded), but it emits ProgressEvent.STANDARD_ERROR_DATA instead of ProgressEvent.STANDARD_OUTPUT_DATA, and all the progress data goes to process.standardError.
protected function errorOutputDataHandler(event:ProgressEvent):void
{
    var process:NativeProcess = event.currentTarget as NativeProcess;
    trace(process.standardError.readUTFBytes(process.standardError.bytesAvailable));
}
Here is the output of the latter function:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
1 15.8M 0 0 1 200k 0 166k 0:01:37 0:00:01 0:01:36 177k
2 15.8M 0 0 2 381k 0 143k 0:01:53 0:00:02 0:01:51 146k
...
What's wrong with my code? How can I debug it?
Thanks.
What you see is curl's progress meter, which curl writes to stderr; that is why it arrives via STANDARD_ERROR_DATA. Try the -sS option to disable the meter while keeping error messages.
