If I run DBI::dbGetQuery(sc, "SHOW DATABASES") in R, I get as result only default database.
And not the full list of hive tables created from the hive> command line...
Also in the R project dir, get's created a derby.log and metastore_db folder.
So my guess is that sparklyr's spark session is no using the global hive config...
I'm using Spark 3.3.0, Sparklyr 1.7.8 and MySQL for metastore...
I have tried changing sql.warehouse.dir to the value of hive's hive.metastore.warehouse.dir which is "/user/hive/warehouse" and sql.catalogImplementation to "hive".
options(sparklyr.log.console = TRUE)
sc_config <- spark_config()
sc_config$spark.sql.warehouse.dir <- "/user/hive/warehouse"
sc_config$spark.sql.catalogImplementation <- "hive"
sc <- spark_connect(master = "yarn", spark_home = "/home/ml/spark", app_name = "TestAPP", config = sc_config)
sparklyr::hive_context_config(sc)
This is the log from > sparklyr.log.console = TRUE:
22/10/18 11:11:43 INFO sparklyr: Session (97754) is starting under 127.0.0.1 port 8880
22/10/18 11:11:43 INFO sparklyr: Session (97754) found port 8880 is available
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) is waiting for sparklyr client to connect to port 8880
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) accepted connection
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) is waiting for sparklyr client to connect to port 8880
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) received command 0
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) found requested session matches current session
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) is creating backend and allocating system resources
22/10/18 11:11:43 INFO sparklyr: Gateway (97754) is using port 8881 for backend channel
22/10/18 11:11:44 INFO sparklyr: Gateway (97754) created the backend
22/10/18 11:11:44 INFO sparklyr: Gateway (97754) is waiting for R process to end
22/10/18 11:11:46 INFO HiveConf: Found configuration file null
22/10/18 11:11:46 INFO SparkContext: Running Spark version 3.3.0
22/10/18 11:11:46 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/10/18 11:11:47 INFO ResourceUtils: ==============================================================
22/10/18 11:11:47 INFO ResourceUtils: No custom resources configured for spark.driver.
22/10/18 11:11:47 INFO ResourceUtils: ==============================================================
22/10/18 11:11:47 INFO SparkContext: Submitted application: TestAPP
22/10/18 11:11:47 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 512, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
22/10/18 11:11:47 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
22/10/18 11:11:47 INFO ResourceProfileManager: Added ResourceProfile id: 0
22/10/18 11:11:48 INFO SecurityManager: Changing view acls to: ml
22/10/18 11:11:48 INFO SecurityManager: Changing modify acls to: ml
22/10/18 11:11:48 INFO SecurityManager: Changing view acls groups to:
22/10/18 11:11:48 INFO SecurityManager: Changing modify acls groups to:
22/10/18 11:11:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ml); groups with view permissions: Set(); users with modify permissions: Set(ml); groups with modify permissions: Set()
22/10/18 11:11:48 INFO Utils: Successfully started service 'sparkDriver' on port 38889.
22/10/18 11:11:48 INFO SparkEnv: Registering MapOutputTracker
22/10/18 11:11:48 INFO SparkEnv: Registering BlockManagerMaster
22/10/18 11:11:48 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
22/10/18 11:11:48 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
22/10/18 11:11:48 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
22/10/18 11:11:49 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-65ec8b4e-6131-4fed-a227-ea5b2162e4d8
22/10/18 11:11:49 INFO MemoryStore: MemoryStore started with capacity 93.3 MiB
22/10/18 11:11:49 INFO SparkEnv: Registering OutputCommitCoordinator
22/10/18 11:11:50 INFO Utils: Successfully started service 'SparkUI' on port 4040.
22/10/18 11:11:50 INFO SparkContext: Added JAR file:/home/ml/R/x86_64-pc-linux-gnu-library/4.2/sparklyr/java/sparklyr-master-2.12.jar at spark://master:38889/jars/sparklyr-master-2.12.jar with timestamp 1666116706621
22/10/18 11:11:51 INFO DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
22/10/18 11:11:53 INFO Configuration: resource-types.xml not found
22/10/18 11:11:53 INFO ResourceUtils: Unable to find 'resource-types.xml'.
22/10/18 11:11:53 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
22/10/18 11:11:53 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
22/10/18 11:11:53 INFO Client: Setting up container launch context for our AM
22/10/18 11:11:53 INFO Client: Setting up the launch environment for our AM container
22/10/18 11:11:53 INFO Client: Preparing resources for our AM container
22/10/18 11:11:53 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
22/10/18 11:12:03 INFO Client: Uploading resource file:/tmp/spark-71575ad6-a8f7-43c0-974e-7c751281ef51/__spark_libs__890394313143327111.zip -> file:/home/ml/.sparkStaging/application_1665674177007_0028/__spark_libs__890394313143327111.zip
22/10/18 11:12:07 INFO Client: Uploading resource file:/tmp/spark-71575ad6-a8f7-43c0-974e-7c751281ef51/__spark_conf__9152665720324853254.zip -> file:/home/ml/.sparkStaging/application_1665674177007_0028/__spark_conf__.zip
22/10/18 11:12:08 INFO SecurityManager: Changing view acls to: ml
22/10/18 11:12:08 INFO SecurityManager: Changing modify acls to: ml
22/10/18 11:12:08 INFO SecurityManager: Changing view acls groups to:
22/10/18 11:12:08 INFO SecurityManager: Changing modify acls groups to:
22/10/18 11:12:08 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ml); groups with view permissions: Set(); users with modify permissions: Set(ml); groups with modify permissions: Set()
22/10/18 11:12:08 INFO Client: Submitting application application_1665674177007_0028 to ResourceManager
22/10/18 11:12:08 INFO YarnClientImpl: Submitted application application_1665674177007_0028
22/10/18 11:12:09 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:09 INFO Client:
client token: N/A
diagnostics: [Tue Oct 18 11:12:08 -0700 2022] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:16384, vCores:16> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; Queue's capacity (absolute resource) = <memory:16384, vCores:16> ; Queue's used capacity (absolute resource) = <memory:0, vCores:0> ; Queue's max capacity (absolute resource) = <memory:16384, vCores:16> ;
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1666116728172
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1665674177007_0028/
user: ml
22/10/18 11:12:10 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:11 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:12 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:13 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:14 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:15 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:16 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:17 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:18 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:19 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:20 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:21 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:22 INFO Client: Application report for application_1665674177007_0028 (state: ACCEPTED)
22/10/18 11:12:23 INFO Client: Application report for application_1665674177007_0028 (state: RUNNING)
22/10/18 11:12:23 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.1.82
ApplicationMaster RPC port: -1
queue: default
start time: 1666116728172
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1665674177007_0028/
user: ml
22/10/18 11:12:23 INFO YarnClientSchedulerBackend: Application application_1665674177007_0028 has started running.
22/10/18 11:12:23 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43035.
22/10/18 11:12:23 INFO NettyBlockTransferService: Server created on master:43035
22/10/18 11:12:23 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/10/18 11:12:23 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, master, 43035, None)
22/10/18 11:12:23 INFO BlockManagerMasterEndpoint: Registering block manager master:43035 with 93.3 MiB RAM, BlockManagerId(driver, master, 43035, None)
22/10/18 11:12:23 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, master, 43035, None)
22/10/18 11:12:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, master, 43035, None)
22/10/18 11:12:23 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> master, PROXY_URI_BASES -> http://master:8088/proxy/application_1665674177007_0028), /proxy/application_1665674177007_0028
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /jobs: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /jobs/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /jobs/job: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /jobs/job/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages/stage: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages/stage/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages/pool: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /stages/pool/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /storage: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /storage/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /storage/rdd: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /storage/rdd/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /environment: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /environment/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /executors: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /executors/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /executors/threadDump: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /executors/threadDump/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:24 INFO ServerInfo: Adding filter to /static: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /api: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /jobs/job/kill: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /stages/stage/kill: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /metrics/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000000000(ns)
22/10/18 11:12:25 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir.
22/10/18 11:12:25 INFO SharedState: Warehouse path is 'file:/user/hive/warehouse'.
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /SQL: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /SQL/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /SQL/execution: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /SQL/execution/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO ServerInfo: Adding filter to /static/sql: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
22/10/18 11:12:25 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
22/10/18 11:12:29 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Requesting driver to remove executor 1 for reason Container from a bad node: container_1665674177007_0028_02_000002 on host: worker1. Exit status: -1000. Diagnostics: [2022-10-18 11:12:26.949]File file:/home/ml/.sparkStaging/application_1665674177007_0028/__spark_libs__890394313143327111.zip does not exist
java.io.FileNotFoundException: File file:/home/ml/.sparkStaging/application_1665674177007_0028/__spark_libs__890394313143327111.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:779)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:1100)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:769)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:462)
at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:271)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:68)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:415)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:412)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:412)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:247)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:240)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:228)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
.
22/10/18 11:12:29 INFO BlockManagerMaster: Removal of executor 1 requested
22/10/18 11:12:29 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asked to remove non-existent executor 1
22/10/18 11:12:29 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster.
22/10/18 11:12:39 INFO HiveUtils: Initializing HiveMetastoreConnection version 2.3.9 using Spark classes.
22/10/18 11:12:40 INFO HiveClientImpl: Warehouse location for Hive client (version 2.3.9) is file:/user/hive/warehouse
22/10/18 11:12:41 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.1.82:43560) with ID 2, ResourceProfileId 0
22/10/18 11:12:42 INFO BlockManagerMasterEndpoint: Registering block manager master:40397 with 93.3 MiB RAM, BlockManagerId(2, master, 40397, None)
22/10/18 11:12:49 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.1.82:43600) with ID 3, ResourceProfileId 0
22/10/18 11:12:50 INFO BlockManagerMasterEndpoint: Registering block manager master:44035 with 93.3 MiB RAM, BlockManagerId(3, master, 44035, None)
And this is the print from > sparklyr::hive_context_config(sc): https://pastebin.com/e28KJ4wQ
Any help?
Thanks in advance.
Okay so I found the solution on this other question.
I added this property to my hive-site.xml and also copied it to $HOME_SPARK/conf/
<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
</property>
Also I removed all spark_config() configs that I tried before.
I would love to know why was this the solution.
When the spark session is created, you could get configure "spark.sql.catalogImplementation" to see if it is "hive".
If spark can not find the hive class, it will change the config to "in-memory".
In org.apache.spark.repl.Main:
if (conf.get(CATALOG_IMPLEMENTATION.key, "hive").toLowerCase(Locale.ROOT) == "hive") {
if (SparkSession.hiveClassesArePresent) {
// In the case that the property is not set at all, builder's config
// does not have this value set to 'hive' yet. The original default
// behavior is that when there are hive classes, we use hive catalog.
sparkSession = builder.enableHiveSupport().getOrCreate()
logInfo("Created Spark session with Hive support")
} else {
// Need to change it back to 'in-memory' if no hive classes are found
// in the case that the property is set to hive in spark-defaults.conf
builder.config(CATALOG_IMPLEMENTATION.key, "in-memory")
sparkSession = builder.getOrCreate()
logInfo("Created Spark session")
}
} else {
// In the case that the property is set but not to 'hive', the internal
// default is 'in-memory'. So the sparkSession will use in-memory catalog.
sparkSession = builder.getOrCreate()
logInfo("Created Spark session")
}
Related
When trying to start Yagna I receive this error, what can I do? I can probably get some DEBUG logs if needed?
[2021-05-06T08:45:08Z INFO yagna] Starting yagna service! Version: 0.6.4 (4fc72117 2021-04-15 build #135).
Log is written to /home/user/.local/share/yagna/yagna_rCURRENT.log
[2021-05-06T08:45:08Z INFO yagna] Data directory: /home/user/.local/share/yagna
[2021-05-06T08:45:08Z INFO ya_sb_router::unix] Router listening on: "/tmp/yagna.sock"
[2021-05-06T08:45:08Z INFO ya_persistence::executor] using database at: /home/user/.local/share/yagna/yagna.db
[2021-05-06T08:45:08Z INFO ya_persistence::executor] using database at: /home/user/.local/share/yagna/market.db
[2021-05-06T08:45:08Z INFO ya_persistence::executor] using database at: /home/user/.local/share/yagna/activity.db
[2021-05-06T08:45:08Z INFO ya_persistence::executor] using database at: /home/user/.local/share/yagna/payment.db
[2021-05-06T08:45:08Z INFO ya_identity::service::identity] using default identity: 0xf5ecffecf053508fe97255e046a04ce21c8ee525
[2021-05-06T08:45:08Z INFO yagna] Identity GSB service successfully activated
[2021-05-06T08:45:08Z INFO ya_metrics::pusher] Metrics pusher started
[2021-05-06T08:45:08Z INFO yagna] Metrics GSB service successfully activated
[2021-05-06T08:45:08Z INFO ya_service_bus::remote_router] trying to connect to: /tmp/yagna.sock
[2021-05-06T08:45:08Z INFO ya_service_bus::connection] started connection to gsb
[2021-05-06T08:45:08Z INFO ya_metrics::pusher] Starting metrics pusher
[2021-05-06T08:45:10Z INFO yagna] Version GSB service successfully activated
[2021-05-06T08:45:10Z INFO ya_net::service] using default identity as network id: 0xf5ecffecf053508fe97255e046a04ce21c8ee525
[2021-05-06T08:45:10Z WARN ya_net::handler] Failed to bind handlers: DNS Error: Not Implemented; retrying in 2 s
[2021-05-06T08:45:12Z WARN ya_net::handler] Failed to bind handlers: DNS Error: Not Implemented; retrying in 4 s
[2021-05-06T08:45:16Z WARN ya_net::handler] Failed to bind handlers: DNS Error: Not Implemented; retrying in 8 s
EDIT: nslookup
Server: 10.139.1.1
Address: 10.139.1.1#53
** server can't find _net._tcp.dev.golem.network: NOTIMP
I'm not sure what is the reason here, but it seems like DNS is not able to resolve _net._tcp.dev.golem.network SRV record yielding 'Not Implemented'. It is very odd, since Yagna is using Google's DNS servers as a default.
When you face this again pls try to check output of following command
nslookup -q=SRV _net._tcp.dev.golem.network 8.8.8.8
The user has trouble reaching Google's DNS with nslookup, so it appears to be something on his end. He is also using a proxy for his connection, so it must happen somewhere in there. Closing thread.
I've set 'execution_timeout': timedelta(seconds=300) parameter on many tasks. When the execution timeout is set on task downloading data from Google Analytics it works properly - after ~300 seconds is the task set to failed. The task downloads some data from API (python), then it does some transformations (python) and loads data into PostgreSQL.
Then I've a task which executes only one PostgreSQL function - execution sometimes takes more than 300 seconds but I get this (task is marked as finished successfully).
*** Reading local file: /home/airflow/airflow/logs/bulk_replication_p2p_realtime/t1/2020-07-20T00:05:00+00:00/1.log
[2020-07-20 05:05:35,040] {__init__.py:1139} INFO - Dependencies all met for <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [queued]>
[2020-07-20 05:05:35,051] {__init__.py:1139} INFO - Dependencies all met for <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [queued]>
[2020-07-20 05:05:35,051] {__init__.py:1353} INFO -
--------------------------------------------------------------------------------
[2020-07-20 05:05:35,051] {__init__.py:1354} INFO - Starting attempt 1 of 1
[2020-07-20 05:05:35,051] {__init__.py:1355} INFO -
--------------------------------------------------------------------------------
[2020-07-20 05:05:35,098] {__init__.py:1374} INFO - Executing <Task(PostgresOperator): t1> on 2020-07-20T00:05:00+00:00
[2020-07-20 05:05:35,099] {base_task_runner.py:119} INFO - Running: ['airflow', 'run', 'bulk_replication_p2p_realtime', 't1', '2020-07-20T00:05:00+00:00', '--job_id', '958216', '--raw', '-sd', 'DAGS_FOLDER/bulk_replication_p2p_realtime.py', '--cfg_path', '/tmp/tmph11tn6fe']
[2020-07-20 05:05:37,348] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:37,347] {settings.py:182} INFO - settings.configure_orm(): Using pool settings. pool_size=10, pool_recycle=1800, pid=26244
[2020-07-20 05:05:39,503] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,501] {__init__.py:51} INFO - Using executor LocalExecutor
[2020-07-20 05:05:39,857] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,856] {__init__.py:305} INFO - Filling up the DagBag from /home/airflow/airflow/dags/bulk_replication_p2p_realtime.py
[2020-07-20 05:05:39,894] {base_task_runner.py:101} INFO - Job 958216: Subtask t1 [2020-07-20 05:05:39,894] {cli.py:517} INFO - Running <TaskInstance: bulk_replication_p2p_realtime.t1 2020-07-20T00:05:00+00:00 [running]> on host dwh2-airflow-dev
[2020-07-20 05:05:39,938] {postgres_operator.py:62} INFO - Executing: CALL dw_system.bulk_replicate(p_graph_name=>'replication_p2p_realtime',p_group_size=>4 , p_group=>1, p_dag_id=>'bulk_replication_p2p_realtime', p_task_id=>'t1')
[2020-07-20 05:05:39,960] {logging_mixin.py:95} INFO - [2020-07-20 05:05:39,953] {base_hook.py:83} INFO - Using connection to: id: postgres_warehouse. Host: XXX Port: 5432, Schema: XXXX Login: XXX Password: XXXXXXXX, extra: {}
[2020-07-20 05:05:39,973] {logging_mixin.py:95} INFO - [2020-07-20 05:05:39,972] {dbapi_hook.py:171} INFO - CALL dw_system.bulk_replicate(p_graph_name=>'replication_p2p_realtime',p_group_size=>4 , p_group=>1, p_dag_id=>'bulk_replication_p2p_realtime', p_task_id=>'t1')
[2020-07-20 05:23:21,450] {logging_mixin.py:95} INFO - [2020-07-20 05:23:21,449] {timeout.py:42} ERROR - Process timed out, PID: 26244
[2020-07-20 05:23:36,453] {logging_mixin.py:95} INFO - [2020-07-20 05:23:36,452] {jobs.py:2562} INFO - Task exited with return code 0
Does anyone know how to enforce execution timeout out for such long running functions? It seems that the execution timeout is evaluated once the PG function finish.
Airflow uses the signal module from the standard library to affect a timeout. In Airflow it's used to hook into these system signals and request that the calling process be notified in N seconds and, should the process still be inside the context (see the __enter__ and __exit__ methods on the class) it will raise an AirflowTaskTimeout exception.
Unfortunately for this situation, there are certain classes of system operations that cannot be interrupted. This is actually called out in the signal documentation:
A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an arbitrary amount of time, regardless of any signals received. The Python signal handlers will be called when the calculation finishes.
To which we say "But I'm not doing a long-running calculation in C!" -- yeah for Airflow this is almost always due to uninterruptable I/O operations.
The highlighted sentence above (emphasis mine) nicely explains why the handler is still triggered even after the task is allowed to (frustratingly!) finish, well beyond your requested timeout.
I was originally using https://docs.wso2.com/display/CLUSTER44x/Configuring+the+Pre-Packaged+Identity+Server+5.2.0+with+API+Manager+2.0.0.
Now i migrated to API manager 2.1.0 and Id server 5.3.0
After migrating id server , I am unable to login. even for the admin/admin , I am unable to login.
The below error is shown
[2018-05-01 05:36:53,528] WARN {java.util.prefs.FileSystemPreferences} - Couldn't flush system prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
[2018-05-01 05:36:36,073] WARN {org.wso2.carbon.core.services.util.CarbonAuthenticationUtil} - Failed Administrator login attempt 'admin[-1234]' at [2018-05-01 05:36:36,073+0000]
I dont have any other errors in the log.But server takes long to start up because there is some messages like below
[2018-05-01 05:30:30,549] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,
[2018-05-01 05:31:30,548] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,
[2018-05-01 05:32:30,548] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,
[2018-05-01 05:33:30,548] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,
[2018-05-01 05:34:30,548] WARN {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Waiting for required OSGi services: org.wso2.carbon.server.admin.common.IServerAdmin,
Any help plz
can you check whether you have configured the user-mgt.xml in /repository/conf location please. check whether it is same in previous version and new version. Also check whether you configured the user store correctly in the repository/conf/datasources/master-datasources.xml file
I am getting the below error when starting the Realm object server. I am running the server on Mac.
Realm Mobile Platform version is 1.8.1
2017-07-08T09:48:06.362Z - info: Logging to console at level 'info'.
2017-07-08T09:48:06.438Z - info: Realm Object Server sync engine listening on 127.0.0.1:27800.
2017-07-08T09:48:06.484Z - info: permission: Seed permission-Realms
2017-07-08T09:48:06.496Z - info: Realm Object Server web server listening on 127.0.0.1:27080.
2017-07-08T09:48:06.498Z - info: http proxy listening on :::9080.
2017-07-08T09:48:06.503Z - info: client: Opening Realm file: /Users/vkuppusamy/Documents/realm-mobile-platform/realm-object-server/object-server/root_dir/internal_data/auth.realm
2017-07-08T09:48:06.503Z - info: client: Connection[1]: Session[1]: Starting session for '/Users/vkuppusamy/Documents/realm-mobile-platform/realm-object-server/object-server/root_dir/internal_data/auth.realm'
2017-07-08T09:48:06.503Z - info: client: Connection[1]: Resolving ':::9080'
2017-07-08T09:48:06.503Z - info: client: Connection[1]: Connecting to endpoint ':::9080' (1/1)
2017-07-08T09:48:06.503Z - error: client: Connection[1]: Failed to connect to endpoint ':::9080': Connection refused
2017-07-08T09:48:06.503Z - error: client: Connection[1]: Failed to connect to ':::9080': All endpoints failed
2017-07-08T09:48:06.504Z - info: client: Opening Realm file: /Users/vkuppusamy/Documents/realm-mobile-platform/realm-object-server/object-server/realm-object-server/listener/__admin.realm
2017-07-08T09:48:06.504Z - info: client: Connection[2]: Session[2]: Starting session for '/Users/vkuppusamy/Documents/realm-mobile-platform/realm-object-server/object-server/realm-object-server/listener/_
You can ignore that - it's likely due to clients trying to connect before the server has initialized completely.
I have Spark 1.4.1 installed on an Hadoop 2.7 cluster.
I've started the SparkR shell without error:
bin/sparkR --master yarn-client
I've run the R command without error (introductory example from spark.apache.org) :
df <- createDataFrame(sqlContext, faithful)
When I run the command :
head(select(df, df$eruptions))
I get the following error at 15/09/02 10:08:29 on the executor node :
"Rscript execution error: No such file or directory".
Any hint would be greatly appreciated.
Spark tasks other than SparkR run OK on my YARN cluster.
R 3.2.1 is installed and runs OK on the driver node.
15/09/02 10:04:06 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/09/02 10:04:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/02 10:04:10 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/09/02 10:04:10 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/09/02 10:04:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/09/02 10:04:11 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/02 10:04:12 INFO Remoting: Starting remoting
15/09/02 10:04:12 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher#datanode1.hp.com:46167]
15/09/02 10:04:12 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 46167.
15/09/02 10:04:12 INFO spark.SecurityManager: Changing view acls to: yarn,root
15/09/02 10:04:12 INFO spark.SecurityManager: Changing modify acls to: yarn,root
15/09/02 10:04:12 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, root); users with modify permissions: Set(yarn, root)
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/09/02 10:04:12 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/09/02 10:04:12 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/09/02 10:04:12 INFO Remoting: Starting remoting
15/09/02 10:04:13 INFO util.Utils: Successfully started service 'sparkExecutor' on port 47919.
15/09/02 10:04:13 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor#datanode1.hp.com:47919]
15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data2/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-5e435e40-bd36-4746-9acd-8cf1619033ae
15/09/02 10:04:13 INFO storage.DiskBlockManager: Created local directory at /data3/hadoop/yarn/local/usercache/root/appcache/application_1441180800595_0001/blockmgr-28dfabe6-8e0d-4e49-bc95-27b3428c10a0
15/09/02 10:04:13 INFO storage.MemoryStore: MemoryStore started with capacity 534.5 MB
15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver#192.1.1.1:45596/user/CoarseGrainedScheduler
15/09/02 10:04:13 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/09/02 10:04:13 INFO executor.Executor: Starting executor ID 2 on host datanode1.hp.com
15/09/02 10:04:14 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34166.
15/09/02 10:04:14 INFO netty.NettyBlockTransferService: Server created on 34166
15/09/02 10:04:14 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/09/02 10:04:14 INFO storage.BlockManagerMaster: Registered BlockManager
15/09/02 10:06:35 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/09/02 10:06:35 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(854) called with curMem=0, maxMem=560497950
15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 534.5 MB)
15/09/02 10:06:35 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 159 ms
15/09/02 10:06:35 INFO storage.MemoryStore: ensureFreeSpace(1280) called with curMem=854, maxMem=560497950
15/09/02 10:06:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 534.5 MB)
15/09/02 10:06:35 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 11589 bytes result sent to driver
15/09/02 10:08:28 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
15/09/02 10:08:28 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 1)
15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 1
15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(4022) called with curMem=0, maxMem=560497950
15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.9 KB, free 534.5 MB)
15/09/02 10:08:28 INFO broadcast.TorrentBroadcast: Reading broadcast variable 1 took 13 ms
15/09/02 10:08:28 INFO storage.MemoryStore: ensureFreeSpace(9536) called with curMem=4022, maxMem=560497950
15/09/02 10:08:28 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 9.3 KB, free 534.5 MB)
15/09/02 10:08:29 INFO r.BufferedStreamThread: Rscript execution error: No such file or directory
15/09/02 10:08:39 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.net.SocketTimeoutException: Accept timed out
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
at java.net.ServerSocket.implAccept(ServerSocket.java:530)
at java.net.ServerSocket.accept(ServerSocket.java:498)
at org.apache.spark.api.r.RRDD$.createRWorker(RRDD.scala:425)
at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:63)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)