Nexus 3 start-up really slow

We're running Sonatype Nexus 3 on an Odroid C1+, roughly equivalent to an RPi 2. Yes, not recommended, but for our development team of 2, performance is acceptable.
Except for start-up: the Nexus 3 OSS server takes an hour or more to become available.
Is that normal?
Any ideas why it is so slow?
Here are some of the log entries from a start-up.
2017-05-16 05:36:29,185+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.server.OServer - $ANSI{green:italic OrientDB Server is active} v2.2.13.
2017-05-16 05:36:29,189+0000 INFO [FelixStartLevel] *SYSTEM org.sonatype.nexus.internal.orient.DatabaseServerImpl - Activated
2017-05-16 05:36:29,242+0000 INFO [FelixStartLevel] *SYSTEM org.sonatype.nexus.extender.NexusLifecycleManager - Start UPGRADE
2017-05-16 05:36:35,919+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Storage 'config' was not closed properly. Will try to recover from write ahead log
2017-05-16 05:36:35,931+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Looking for last checkpoint...
2017-05-16 05:36:36,515+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Found FUZZY checkpoint.
2017-05-16 05:36:36,530+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Data restore procedure from FUZZY checkpoint is started.
2017-05-16 05:36:36,562+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=6856901}} will be skipped during data restore
2017-05-16 05:36:36,570+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} 1 operations were processed, current LSN is OLogSequenceNumber{segment=0, position=6856901} last LSN is OLogSequenceNumber{segment=0, position=6856954}
2017-05-16 05:36:36,579+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Record OFuzzyCheckpointStartRecord{lsn=OLogSequenceNumber{segment=0, position=6856908}} com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointStartRecord{lsn=null, previousCheckpoint=OLogSequenceNumber{segment=0, position=6856861}} will be skipped during data restore
2017-05-16 05:36:36,822+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=6856948}} will be skipped during data restore
2017-05-16 05:36:36,829+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Storage data recover was completed
2017-05-16 05:42:54,800+0000 INFO [FelixStartLevel] *SYSTEM org.sonatype.nexus.extender.NexusLifecycleManager - Start SCHEMAS
2017-05-16 05:42:55,883+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Storage 'analytics' was not closed properly. Will try to recover from write ahead log
2017-05-16 05:42:55,894+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Looking for last checkpoint...
2017-05-16 05:42:55,904+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Found FUZZY checkpoint.
2017-05-16 05:42:55,913+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Data restore procedure from FUZZY checkpoint is started.
2017-05-16 05:42:55,921+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=2246069}} will be skipped during data restore
2017-05-16 05:42:55,929+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} 1 operations were processed, current LSN is OLogSequenceNumber{segment=0, position=2246069} last LSN is OLogSequenceNumber{segment=0, position=2246122}
2017-05-16 05:42:55,938+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Record OFuzzyCheckpointStartRecord{lsn=OLogSequenceNumber{segment=0, position=2246076}} com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointStartRecord{lsn=null, previousCheckpoint=OLogSequenceNumber{segment=0, position=2246029}} will be skipped during data restore
2017-05-16 05:42:55,946+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=2246116}} will be skipped during data restore
2017-05-16 05:42:55,952+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=analytics}} Storage data recover was completed
2017-05-16 05:49:05,078+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Storage 'audit' was not closed properly. Will try to recover from write ahead log
2017-05-16 05:49:05,089+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Looking for last checkpoint...
2017-05-16 05:49:05,097+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Found FUZZY checkpoint.
2017-05-16 05:49:05,105+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Data restore procedure from FUZZY checkpoint is started.
2017-05-16 05:49:05,113+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=2268394}} will be skipped during data restore
2017-05-16 05:49:05,121+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} 1 operations were processed, current LSN is OLogSequenceNumber{segment=0, position=2268394} last LSN is OLogSequenceNumber{segment=0, position=2268447}
2017-05-16 05:49:05,129+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Record OFuzzyCheckpointStartRecord{lsn=OLogSequenceNumber{segment=0, position=2268401}} com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointStartRecord{lsn=null, previousCheckpoint=OLogSequenceNumber{segment=0, position=2268354}} will be skipped during data restore
2017-05-16 05:49:05,138+0000 WARN [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Record com.orientechnologies.orient.core.storage.impl.local.paginated.wal.OFuzzyCheckpointEndRecord{lsn=OLogSequenceNumber{segment=0, position=2268441}} will be skipped during data restore
2017-05-16 05:49:05,144+0000 INFO [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=audit}} Storage data recover was completed

A few notes:
This is cool! Someone on our internal team got it working on a Raspberry Pi 3 a while ago, so we love seeing stuff like this.
We don't support this configuration, so any help from us is kinda on a "wow this is cool" sort of level.
Our internal dude said he noticed that NXRM 3 runs out of memory while updating schemas during boot on the Raspberry Pi 3, and at first he hadn't found a workaround.
He has since found one. (He's got a great big beard, too.) He edited nexus.vmoptions to have the following:
-Xms256M
-Xmx256M
-XX:MaxDirectMemorySize=512M
Orient uses system memory (off-heap), which is distinct from Nexus Repo itself, which uses the Java heap, if that helps you at all. The "Storage ... was not closed properly" lines in your log are probably also wildly relevant: each database is replaying its write-ahead log at start-up because the previous shutdown wasn't clean, and the roughly six-minute gaps between each database in your timestamps are where a lot of that hour is going.
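If that's what's biting you, stopping Nexus cleanly before any reboot should let OrientDB checkpoint properly and skip that recovery pass on the next start. A minimal sketch, assuming the default install layout under /opt/nexus (adjust the path and user to your setup):

$ /opt/nexus/bin/nexus stop

or wrap it in a systemd unit so the OS always stops it gracefully, along the lines of:

[Unit]
Description=nexus service
After=network.target

[Service]
Type=forking
LimitNOFILE=65536
ExecStart=/opt/nexus/bin/nexus start
ExecStop=/opt/nexus/bin/nexus stop
User=nexus
Restart=on-abort

[Install]
WantedBy=multi-user.target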
Here's some info we put together about OrientDB memory usage and tuning:
Optimizing OrientDB Database Memory
Also here is a more generic article on system requirements in regards to Nexus Repository Manager 3:
Nexus Repository Manager 3 System Requirements
What I would suggest for your team is wildly different: use our Docker image and spin Nexus Repo up on whatever hardware you've got.
Docker Nexus3
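For reference, the quickstart from that image's docs is roughly (the port mapping and volume name are just the documented defaults):

$ docker run -d -p 8081:8081 --name nexus -v nexus-data:/nexus-data sonatype/nexus3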
Best of luck either way!

Related

Failed upgrade from 3.41.0 to 3.41.1 (3.42.0)

Nexus is running in Kubernetes; until now it upgraded without problems. At the moment I get the following error no matter which version I upgrade to, 3.41.1 or 3.42.0.
Log output from version 3.41.1:
-------------------------------------------------
Started Sonatype Nexus OSS 3.41.1-01
-------------------------------------------------
2022-10-19 11:34:30,682+0000 INFO [jetty-main-1] *SYSTEM org.eclipse.jetty.server.AbstractConnector - Started ServerConnector@31451f5e{HTTP/1.1, (http/1.1)}{0.0.0.0:8086}
2022-10-19 11:34:30,686+0000 INFO [jetty-main-1] *SYSTEM org.eclipse.jetty.server.AbstractConnector - Started ServerConnector@2105be54{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2022-10-19 11:34:30,689+0000 INFO [jetty-main-1] *SYSTEM org.eclipse.jetty.server.AbstractConnector - Started ServerConnector@5f41ab0d{HTTP/1.1, (http/1.1)}{0.0.0.0:8085}
2022-10-19 11:34:31,834+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Could not lock User prefs. Unix error code 2.
2022-10-19 11:34:31,835+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
2022-10-19 11:34:48,228+0000 INFO [qtp491323871-560] *UNKNOWN org.apache.shiro.session.mgt.AbstractValidatingSessionManager - Enabling session validation scheduler...
2022-10-19 11:34:48,238+0000 INFO [qtp491323871-556] *UNKNOWN org.sonatype.nexus.internal.security.anonymous.AnonymousManagerImpl - Loaded configuration: OrientAnonymousConfiguration{enabled=true, userId='anonymous', realmName='NexusAuthorizingRealm'}
2022-10-19 11:35:01,488+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Could not lock User prefs. Unix error code 2.
2022-10-19 11:35:01,489+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
2022-10-19 11:35:31,489+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Could not lock User prefs. Unix error code 2.
2022-10-19 11:35:31,489+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
2022-10-19 11:36:01,489+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Could not lock User prefs. Unix error code 2.
2022-10-19 11:36:01,490+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
2022-10-19 11:36:18,170+0000 WARN [SIGTERM handler] *SYSTEM com.orientechnologies.orient.core.OSignalHandler - Received signal: SIGTERM
I get a similar error when updating to 3.42.0.
What could be the problem?
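One hint from the log, offered as a guess rather than a diagnosis: the repeating java.util.prefs warnings usually mean the Java Preferences directory under the process's home isn't writable inside the container. Nexus normally pins that directory into its data dir via nexus.vmoptions, so it's worth checking that a line like the following is present and points somewhere writable (the path shown is the Docker image's default data dir):

-Djava.util.prefs.userRoot=/nexus-data/javaprefs

The SIGTERM at 11:36:18 is also telling: something outside the JVM asked the pod to stop about two minutes after start, which in Kubernetes is typically a liveness/readiness probe giving up, so relaxing probe timeouts while the upgrade migration runs may be worth a try.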

Openstack fails to launch new instances with status ERROR

I was following the tutorials on the OpenStack (Stein) docs website to launch an instance on my provider network. I am using networking option 2. I run the following command to create the instance, replacing PROVIDER_NET_ID with my provider network ID.
openstack server create --flavor m1.nano --image cirros \
--nic net-id=PROVIDER_NET_ID --security-group default \
--key-name mykey provider-instance1
I run openstack server list to check the status of my instance. It shows a status of ERROR.
I checked the /var/log/nova/nova-compute.log on my compute node (I only have one compute node) and came across the following error.
ERROR nova.compute.manager [req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6
995cce48094442b4b29f3fb665219408 429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] Instance failed to spawn: PermissionError:
[Errno 13] Permission denied: '/var/lib/nova/instances/19a6c859-5dde-4ed3-9010-4d93ebe9a942'
However, the log entries just before this error suggest everything was fine up to the point where it occurred.
2022-05-23 12:40:21.011 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942]
Attempting claim on node compute1: memory 64 MB, disk 1 GB, vcpus 1 CPU
2022-05-23 12:40:21.019 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] Total memory: 3943 MB, used: 512.00 MB
2022-05-23 12:40:21.020 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] memory limit not specified, defaulting to unlimited
2022-05-23 12:40:21.020 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] Total disk: 28 GB, used: 0.00 GB
2022-05-23 12:40:21.021 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] disk limit not specified, defaulting to unlimited
2022-05-23 12:40:21.024 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] Total vcpu: 4 VCPU, used: 0.00 VCPU
2022-05-23 12:40:21.025 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] vcpu limit not specified, defaulting to unlimited
2022-05-23 12:40:21.028 9627 INFO nova.compute.claims
[req-0a1a6ddf-edc7-4987-a018-32af7b6a29b6 995cce48094442b4b29f3fb665219408
429965cc45a743a5b00f3a1bd098e1ab - default default]
[instance: 19a6c859-5dde-4ed3-9010-4d93ebe9a942] Claim successful on node compute1
Anyone have any ideas on what I may be doing wrong?
I'd be thankful.
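One thing worth checking, given the Errno 13 above (a suggestion, not a confirmed fix): whether the nova user can actually write to the instances directory on the compute node. Something like:

$ ls -ld /var/lib/nova/instances
$ sudo chown -R nova:nova /var/lib/nova/instances

The owning user/group can differ by distro packaging, so verify the service account first with ps -o user= -C nova-compute.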

STORE relation problem using pig -x local, failed to read data

1st approach: Using pig -x mapreduce
HBase table created via the hbase shell:
hbase(main):003:0> list
TABLE
clientes
1 row(s)
Took 0.0047 seconds
=> ["clientes"]
Used this code to load data from clientes.txt into dados (pig -x mapreduce):
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
id:chararray,
nome:chararray,
sobrenome:chararray,
idade:int,
funcao:chararray
);
Checked dados with dump dados, and it failed:
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1615152557282_0002
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:00:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:00:32,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:00:37,406 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1615152557282_0002 has failed! Stop running all dependent jobs
2021-03-07 19:00:37,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:00:37,410 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2021-03-07 19:00:37,492 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1615152557282_0002. Redirecting to job history server.
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:00:31 2021-03-07 19:00:37 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1615152557282_0002 dados MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1565)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1562)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
at java.lang.Thread.run(Thread.java:748)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:737)
at org.apache.hadoop.fs.RawLocalFileSystem.setWorkingDirectory(RawLocalFileSystem.java:604)
at org.apache.hadoop.fs.FilterFileSystem.setWorkingDirectory(FilterFileSystem.java:307)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:250)
... 18 more
hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_1615152557282_0002
2021-03-07 19:00:37,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-03-07 19:00:37,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias dados. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154395936.log
2nd approach: Using pig -x local (dump dados works)
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
>> id:chararray,
>> nome:chararray,
>> sobrenome:chararray,
>> idade:int,
>> funcao:chararray
>> );
2021-03-07 19:02:17,219 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-03-07 19:02:17,222 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt:0+794
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 2
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2021-03-07 19:02:17,241 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:02:17,243 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:02:17,253 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:02:17,266 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,274 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task:attempt_local116575577_0001_m_000000_0 is done. And is in the process of committing
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task attempt_local116575577_0001_m_000000_0 is allowed to commit now
2021-03-07 19:02:17,285 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local116575577_0001_m_000000_0' to file:/tmp/temp2133275539/tmp1539690224
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - map
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local116575577_0001_m_000000_0' done.
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Final Counters for attempt_local116575577_0001_m_000000_0: Counters: 16
File System Counters
FILE: Number of bytes read=1264
FILE: Number of bytes written=530456
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=20
Map output records=20
Input split bytes=414
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=311427072
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
org.apache.pig.PigWarning
FIELD_DISCARDED_TYPE_CONVERSION_FAILED=1
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local116575577_0001_m_000000_0
2021-03-07 19:02:17,291 [Thread-7] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:02:17,485 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2021-03-07 19:02:17,492 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2021-03-07 19:02:17,493 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,536 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:02:17,540 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:02:16 2021-03-07 19:02:17 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local116575577_0001 1 0 n/a n/a n/a n/a 0 0 0 0 dados MAP_ONLY file:/tmp/temp2133275539/tmp1539690224,
Input(s):
Successfully read 20 records from: "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Successfully stored 20 records in: "file:/tmp/temp2133275539/tmp1539690224"
Counters:
Total records written : 20
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local116575577_0001
2021-03-07 19:02:17,542 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,544 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,551 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,558 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2021-03-07 19:02:17,558 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2021-03-07 19:02:17,563 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2021-03-07 19:02:17,563 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2021-03-07 19:02:17,570 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2021-03-07 19:02:17,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(id,nome,sobrenome,,funcao)
(c001,Josias,Silva,55,Analista de Mercado)
(1100002,Pedro,Malan,74,Professor)
(1100003,Maria,Maciel,34,Bombeiro)
(1100004,Suzana,Bustamante,66,Analista de TI)
(1100005,Karen,Moreira,74,Advogado)
(1100006,Patricio,Teixeira,42,Veterinario)
(1100007,Elisa,Haniero,43,Piloto)
(1100008,Mauro,Bender,63,Marceneiro)
(1100009,Mauricio,Wagner,39,Artista)
(1100010,Douglas,Macedo,60,Escritor)
(1100011,Francisco,McNamara,47,Cientista de Dados)
(1100012,Sidney,Raynor,26,Escritor)
(1100013,Maria,Moon,41,Gerente de Projetos)
(1100014,Bete,Balanaira,65,Musico)
(1100015,Julia,Peixoto,49,Especialista em TI)
(1100016,Jeronimo,Wallace,52,Engenheiro de Dados)
(1100017,Noeli,Laura,72,Cientista de Dados)
(1100018,Jean,Junior,45,Desenvolvedor RPA)
(1100019,Cristina,Garbim,63,Engenheiro Blockchain)
But STORE dados INTO 'hbase://clientes' and STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' both fail:
grunt> STORE dados INTO 'hbase://clientes' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1289080477_0002
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:03:51,347 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:03:51,349 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1289080477_0002]
2021-03-07 19:03:51,835 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for clientes
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:03:51,843 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:03:51,860 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation - Closing zookeeper sessionid=0x1780e985b4d000f
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0] INFO org.apache.zookeeper.ZooKeeper - Session: 0x1780e985b4d000f closed
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1780e985b4d000f
2021-03-07 19:03:51,867 [Thread-10] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:03:51,870 [Thread-10] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1289080477_0002
java.lang.Exception: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657)
at java.util.ArrayList.get(ArrayList.java:433)
at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:992)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
... 18 more
2021-03-07 19:03:52,055 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1289080477_0002 has failed! Stop running all dependent jobs
2021-03-07 19:03:52,055 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:03:52,056 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:03:50 2021-03-07 19:03:52 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1289080477_0002 dados MAP_ONLY Message: Job failed! hbase://clientes,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "hbase://clientes"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1289080477_0002
2021-03-07 19:03:52,058 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
grunt> STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
java.lang.Exception: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:196)
at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:149)
at org.apache.hadoop.hbase.TableName.<init>(TableName.java:322)
at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:358)
at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:449)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.<init>(TableOutputFormat.java:107)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.getRecordWriter(TableOutputFormat.java:153)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:83)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:659)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1458581109_0003
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:05:10,476 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C: R:
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:05:10,477 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:05:10,477 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1458581109_0003 has failed! Stop running all dependent jobs
2021-03-07 19:05:10,478 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:05:10,478 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,479 [main] WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,480 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
3.2.2 0.17.0 hadoop 2021-03-07 19:05:10 2021-03-07 19:05:10 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_local1458581109_0003 dados MAP_ONLY Message: Job failed! file:///home/hadoop/hadloop/pig_output,
Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"
Output(s):
Failed to produce result in "file:///home/hadoop/hadloop/pig_output"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1458581109_0003
2021-03-07 19:05:10,480 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
Services running:
(base) [hadoop@dataserver 1-HBase]$ jps
4160 SecondaryNameNode
11666 Main
5413 HQuorumPeer
5766 HRegionServer
6966 JobHistoryServer
4631 NodeManager
4457 ResourceManager
5578 HMaster
3835 DataNode
12382 Jps
3615 NameNode
Hadoop version:
(base) [hadoop@dataserver 1-HBase]$ hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar
HBase version:
(base) [hadoop@dataserver 1-HBase]$ hbase version
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.2.0
Source code repository file:///opt/hbase-rm/output/hbase-2.2.0-bin revision=Unknown
Compiled by hbase-rm on Tue Jun 11 04:30:30 UTC 2019
From source with checksum 63a465554927aeea3f1f0bcae63decff
Pig Version:
(base) [hadoop@dataserver 1-HBase]$ pig version
2021-03-07 19:08:50,197 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-03-07 19:08:50,263 [main] INFO org.apache.pig.Main - Logging error messages to: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,536 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,557 [main] INFO org.apache.pig.Main - Pig script completed in 400 milliseconds (400 ms)
To solve this issue you need to start the Job History Server, a YARN service.
Run the following command:
mr-jobhistory-daemon.sh start historyserver
and check that the service is running via the jps command:
13153 HQuorumPeer
13314 HMaster
**20242 JobHistoryServer**
5043 NameNode
6003 NodeManager
30163 Jps
5845 ResourceManager
5514 SecondaryNameNode
5227 DataNode
28510 RunJar
13519 HRegionServer
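A few hedged notes on the rest, untested against this exact setup:

On Hadoop 3.x, mr-jobhistory-daemon.sh is deprecated; the equivalent command is:
mapred --daemon start historyserver

The 'Illegal character code:47' failure is expected behavior: HBaseStorage always interprets the INTO location as an HBase table name, so a file:/// path can never work with it. For a filesystem path, use PigStorage instead (the output path here is illustrative):
grunt> STORE dados INTO 'file:///home/hadoop/pig_output' USING PigStorage(',');

For the hbase://clientes store, the IndexOutOfBoundsException in HBaseStorage.putNext suggests it parsed fewer column descriptors than the tuple has fields. The documented form passes the columns as one space-separated string, with the first field of the relation (id here) becoming the row key, so it may be worth retrying with the list on a single line:
grunt> STORE dados INTO 'hbase://clientes' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('dados_clientes:nome dados_clientes:sobrenome dados_clientes:idade dados_clientes:funcao');

Also note the first tuple in the dump output, (id,nome,sobrenome,,funcao): the header line of clientes.txt is being loaded as data (that's the one FIELD_DISCARDED_TYPE_CONVERSION_FAILED warning), so filtering it out, e.g. dados = FILTER dados BY id != 'id';, is probably wanted before storing. And incidentally, pig version fails with 'File version does not exist' because Pig treats the argument as a script name; the flag is pig -version.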

Airflow Execution Timeout not working well

I've set the 'execution_timeout': timedelta(seconds=300) parameter on many tasks. When the execution timeout is set on a task downloading data from Google Analytics it works properly: after ~300 seconds the task is set to failed. That task downloads some data from an API (Python), then does some transformations (Python) and loads the data into PostgreSQL.
Then I have a task which executes only one PostgreSQL function. Its execution sometimes takes more than 300 seconds, but I get the log below (the task is marked as finished successfully).
*** Reading local file: /home/airflow/airflow/logs/bulk_replication_p2p_realtime/t1/2020-07-20T00:05:00+00:00/1.log
[2020-07-20 05:05:35,040] {__init__.py:1139} INFO - Dependencies all met for <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [queued]>
[2020-07-20 05:05:35,051] {__init__.py:1139} INFO - Dependencies all met for <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [queued]>
[2020-07-20 05:05:35,051] {__init__.py:1353} INFO -
--------------------------------------------------------------------------------
[2020-07-20 05:05:35,051] {__init__.py:1354} INFO - Starting attempt 1 of 1
[2020-07-20 05:05:35,051] {__init__.py:1355} INFO -
--------------------------------------------------------------------------------
[2020-07-20 05:05:35,098] {__init__.py:1374} INFO - Executing <Task(PostgresOperator): t1> on 2020-07-20T00:05:00+00:00
[2020-07-20 05:05:35,099] {base_task_runner.py:119} INFO - Running: ['airflow', 'run', 'bulk_replication_p2p_realtime', 't1', '2020-07-20T00:05:00+00:00', '--job_id', '958216', '--raw', '-sd', 'DAGS_FOLDER/bulk_replication_p2p_realtime.py', '--cfg_path', '/tmp/tmph11tn6fe']
[2020-07-20 05:05:37,348] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:37,347] {settings.py:182} INFO - settings.configure_orm(): Using pool settings. pool_size=10, pool_recycle=1800, pid=26244
[2020-07-20 05:05:39,503] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,501] {__init__.py:51} INFO - Using executor LocalExecutor
[2020-07-20 05:05:39,857] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,856] {__init__.py:305} INFO - Filling up the DagBag from /home/airflow/airflow/dags/bulk_replication_p2p_realtime.py
[2020-07-20 05:05:39,894] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,894] {cli.py:517} INFO - Running <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [running]> on host dwh2-airflow-dev
[2020-07-20 05:05:39,938] {postgres_operator.py:62} INFO - Executing: CALL dw_system.bulk_replicate(p_graph_name=>'replication_p2p_realtime',p_group_size=>4 , p_group=>1, p_dag_id=>'bulk_replication_p2p_realtime', p_task_id=>'t1')
[2020-07-20 05:05:39,960] {logging_mixin.py:95} INFO - [2020-07-20 05:05:39,953] {base_hook.py:83} INFO - Using connection to: id: postgres_warehouse. Host: XXX Port: 5432, Schema: XXXX Login: XXX Password: XXXXXXXX, extra: {}
[2020-07-20 05:05:39,973] {logging_mixin.py:95} INFO - [2020-07-20 05:05:39,972] {dbapi_hook.py:171} INFO - CALL dw_system.bulk_replicate(p_graph_name=>'replication_p2p_realtime',p_group_size=>4 , p_group=>1, p_dag_id=>'bulk_replication_p2p_realtime', p_task_id=>'t1')
[2020-07-20 05:23:21,450] {logging_mixin.py:95} INFO - [2020-07-20 05:23:21,449] {timeout.py:42} ERROR - Process timed out, PID: 26244
[2020-07-20 05:23:36,453] {logging_mixin.py:95} INFO - [2020-07-20 05:23:36,452] {jobs.py:2562} INFO - Task exited with return code 0
Does anyone know how to enforce the execution timeout for such long-running functions? It seems that the execution timeout is only evaluated once the PG function finishes.
Airflow uses the signal module from the standard library to implement the timeout. Airflow hooks into the SIGALRM system signal and asks the OS to notify the calling process in N seconds; should the process still be inside the timeout context when the signal arrives (see the __enter__ and __exit__ methods on the class), it raises an AirflowTaskTimeout exception.
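A minimal sketch of that mechanism, not Airflow's actual class, just the shape of it:

import signal

class Timeout:
    # Raise an exception if the body of the with-block outlives `seconds`.
    def __init__(self, seconds):
        self.seconds = seconds

    def handle_timeout(self, signum, frame):
        raise TimeoutError(f"Process timed out after {self.seconds}s")

    def __enter__(self):
        # Ask the OS to deliver SIGALRM to this process in `seconds` seconds.
        signal.signal(signal.SIGALRM, self.handle_timeout)
        signal.alarm(self.seconds)

    def __exit__(self, exc_type, exc_val, exc_tb):
        signal.alarm(0)  # cancel the pending alarm if we finished in time

with Timeout(300):
    pass  # long-running work here; a slow Python-level call raises TimeoutError at ~300s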
Unfortunately for this situation, there are certain classes of system operations that cannot be interrupted. This is actually called out in the signal documentation:
A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an arbitrary amount of time, regardless of any signals received. **The Python signal handlers will be called when the calculation finishes.**
To which we say "But I'm not doing a long-running calculation in C!" -- yeah, but for Airflow this is almost always due to uninterruptible I/O operations at the C level (here, most likely the database driver blocking inside your CALL).
The bolded sentence above (emphasis mine) nicely explains why the handler is still triggered even after the task is (frustratingly!) allowed to finish, well beyond your requested timeout.

Nexus 3 Compact blob store task does not remove images physically

We were deleting old Docker images, keeping the last 10 of them. We tried the Compact blob store task to delete them physically, but under the Administration / Repository / Blob Stores settings the blob store still shows the same size after deleting the images.
This is the compact blob store log:
2018-06-28 14:18:40,709+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - Task information:
2018-06-28 14:18:40,712+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - ID: 2bf9a574-f3e6-4f8e-8351-d98e4abc5103
2018-06-28 14:18:40,712+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - Type: blobstore.compact
2018-06-28 14:18:40,712+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - Name: cbs
2018-06-28 14:18:40,712+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - Description: Compacting default blob store
2018-06-28 14:18:40,713+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.file.FileBlobStore - Deletions index file rebuild not required
2018-06-28 14:18:40,713+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.file.FileBlobStore - Begin deleted blobs processing
2018-06-28 14:18:41,551+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.file.FileBlobStore - Elapsed time: 837.6 ms, processed: 45/45
2018-06-28 14:18:41,551+0200 INFO [quartz-6-thread-20] *SYSTEM org.sonatype.nexus.blobstore.compact.internal.CompactBlobStoreTask - Task complete
Docker layers can be shared across many different images, so the layers associated with an image are not deleted automatically when you delete the image. First run a "Docker - Delete unused manifests and images" task, then try running the Compact blob store task again.
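If you want to script that sequence instead of clicking through the UI, Nexus 3 exposes scheduled tasks over its REST API; a rough sketch (credentials and host are placeholders):

$ curl -u admin:admin123 http://localhost:8081/service/rest/v1/tasks
$ curl -u admin:admin123 -X POST http://localhost:8081/service/rest/v1/tasks/<taskId>/run

where <taskId> comes from the first call's output, running the "Docker - Delete unused manifests and images" task before the compact task.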
