data channel lock error while configuring flume with multiple channels - flume-ng

I have tried to fan out the flow from one source to two channels. I also specified different dataDirs and checkpointDirs properties for each channel, as in the "channel lock error while configuring flume's multiple sources using FILE channels" question, and I have used a multiplexing channel selector. I get the following error:
18/08/23 16:21:37 ERROR file.FileChannel: Failed to start the file channel [channel=fileChannel1_2]
java.io.IOException: Cannot lock /root/.flume/file-channel/data. The directory is already locked. [channel=fileChannel1_2]
at org.apache.flume.channel.file.Log.lock(Log.java:1169)
at org.apache.flume.channel.file.Log.<init>(Log.java:336)
at org.apache.flume.channel.file.Log.<init>(Log.java:76)
at org.apache.flume.channel.file.Log$Builder.build(Log.java:276)
at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:281)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) .....
My configuration file is as follows.
agent1.sinks=hdfs-sink1_1 hdfs-sink1_2
agent1.sources=source1_1
agent1.channels=fileChannel1_1 fileChannel1_2
agent1.channels.fileChannel1_1.type=file
agent1.channels.fileChannel1_1.checkpointDir=/home/Flume/alpha/001
agent1.channels.fileChannel1_1.dataDir=/mnt/alpha_data/
agent1.channels.fileChannel1_1.checkpointOnClose=true
agent1.channels.fileChannel1_1.dataOnClose=true
agent1.sources.source1_1.type=spooldir
agent1.sources.source1_1.spoolDir=/home/ABC/
agent1.sources.source1_1.recursiveDirectorySearch=true
agent1.sources.source1_1.fileSuffix=.COMPLETED
agent1.sources.source1_1.basenameHeader = true
agent1.sinks.hdfs-sink1_1.type=hdfs
agent1.sinks.hdfs-sink1_1.hdfs.filePrefix = %{basename}
agent1.sinks.hdfs-sink1_1.hdfs.path=hdfs://10.44.209.44:9000/flume_sink/CA
agent1.sinks.hdfs-sink1_1.hdfs.batchSize=1000
agent1.sinks.hdfs-sink1_1.hdfs.rollSize=268435456
agent1.sinks.hdfs-sink1_1.hdfs.rollInterval=0
agent1.sinks.hdfs-sink1_1.hdfs.rollCount=50000000
agent1.sinks.hdfs-sink1_1.hdfs.fileType=DataStream
agent1.sinks.hdfs-sink1_1.hdfs.writeFormat=Text
agent1.sinks.hdfs-sink1_1.hdfs.useLocalTimeStamp=false
agent1.channels.fileChannel1_2.type=file
agent1.channels.fileChannel1_2.capacity=200000
agent1.channels.fileChannel1_2.transactionCapacity=1000
agent1.channels.fileChannel1_2.checkpointDir=/home/Flume/beta/001
agent1.channels.fileChannel1_2.dataDir=/mnt/beta_data/
agent1.channels.fileChannel1_2.checkpointOnClose=true
agent1.channels.fileChannel1_2.dataOnClose=true
agent1.sinks.hdfs-sink1_2.type=hdfs
agent1.sinks.hdfs-sink1_2.hdfs.filePrefix = %{basename}
agent1.sinks.hdfs-sink1_2.hdfs.path=hdfs://10.44.209.44:9000/flume_sink/AZ
agent1.sinks.hdfs-sink1_2.hdfs.batchSize=1000
agent1.sinks.hdfs-sink1_2.hdfs.rollSize=268435456
agent1.sinks.hdfs-sink1_2.hdfs.rollInterval=0
agent1.sinks.hdfs-sink1_2.hdfs.rollCount=50000000
agent1.sinks.hdfs-sink1_2.hdfs.fileType=DataStream
agent1.sinks.hdfs-sink1_2.hdfs.writeFormat=Text
agent1.sinks.hdfs-sink1_2.hdfs.useLocalTimeStamp=false
agent1.sources.source1_1.channels=fileChannel1_1 fileChannel1_2
agent1.sinks.hdfs-sink1_1.channel=fileChannel1_1
agent1.sinks.hdfs-sink1_2.channel=fileChannel1_2
agent1.sources.source1_1.selector.type=multiplexing
agent1.sources.source1_1.selector.header=basenameHeader
agent1.sources.source1_1.selector.mapping.CA=fileChannel1_1
agent1.sources.source1_1.selector.mapping.AZ=fileChannel1_2
Can someone give a solution for this?

Try setting a channel for the default property of the multiplexing selector:
agent1.sources.source1_1.selector.default=fileChannel1_1
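For context, a sketch of what the full selector stanza could look like with the default set; this is just an illustration assembled from the keys already in your configs, and it assumes the selector header matches the basename header written by the spooldir source:
agent1.sources.source1_1.selector.type=multiplexing
agent1.sources.source1_1.selector.header=basename
agent1.sources.source1_1.selector.mapping.CA=fileChannel1_1
agent1.sources.source1_1.selector.mapping.AZ=fileChannel1_2
agent1.sources.source1_1.selector.default=fileChannel1_1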

The data channel lock error was corrected, but I still couldn't get the multiplexing to work. Code as follows.
agent1.sinks=hdfs-sink1_1 hdfs-sink1_2 hdfs-sink1_3
agent1.sources=source1_1
agent1.channels=fileChannel1_1 fileChannel1_2 fileChannel1_3
agent1.channels.fileChannel1_1.type=file
agent1.channels.fileChannel1_1.capacity=200000
agent1.channels.fileChannel1_1.transactionCapacity=1000
agent1.channels.fileChannel1_1.checkpointDir=/home/Flume/alpha/001
agent1.channels.fileChannel1_1.dataDirs=/home/Flume/alpha_data
agent1.channels.fileChannel1_1.checkpointOnClose=true
agent1.channels.fileChannel1_1.dataOnClose=true
agent1.sources.source1_1.type=spooldir
agent1.sources.source1_1.spoolDir=/home/ABC/
agent1.sources.source1_1.recursiveDirectorySearch=true
agent1.sources.source1_1.fileSuffix=.COMPLETED
agent1.sources.source1_1.basenameHeader = true
agent1.sources.source1_1.basenameHeaderKey = basename
agent1.sinks.hdfs-sink1_1.type=hdfs
agent1.sinks.hdfs-sink1_1.hdfs.filePrefix = %{basename}
agent1.sinks.hdfs-sink1_1.hdfs.path=hdfs://10.44.209.44:9000/flume_sink/CA
agent1.sinks.hdfs-sink1_1.hdfs.batchSize=1000
agent1.sinks.hdfs-sink1_1.hdfs.rollSize=268435456
agent1.sinks.hdfs-sink1_1.hdfs.rollInterval=0
agent1.sinks.hdfs-sink1_1.hdfs.rollCount=50000000
agent1.sinks.hdfs-sink1_1.hdfs.fileType=DataStream
agent1.sinks.hdfs-sink1_1.hdfs.writeFormat=Text
agent1.sinks.hdfs-sink1_1.hdfs.useLocalTimeStamp=false
agent1.channels.fileChannel1_2.type=file
agent1.channels.fileChannel1_2.capacity=200000
agent1.channels.fileChannel1_2.transactionCapacity=1000
agent1.channels.fileChannel1_2.checkpointDir=/home/Flume/beta/001
agent1.channels.fileChannel1_2.dataDirs=/home/Flume/beta_data
agent1.channels.fileChannel1_2.checkpointOnClose=true
agent1.channels.fileChannel1_2.dataOnClose=true
agent1.sinks.hdfs-sink1_2.type=hdfs
agent1.sinks.hdfs-sink1_2.hdfs.filePrefix = %{basename}
agent1.sinks.hdfs-sink1_2.hdfs.path=hdfs://10.44.209.44:9000/flume_sink/AZ
agent1.sinks.hdfs-sink1_2.hdfs.batchSize=1000
agent1.sinks.hdfs-sink1_2.hdfs.rollSize=268435456
agent1.sinks.hdfs-sink1_2.hdfs.rollInterval=0
agent1.sinks.hdfs-sink1_2.hdfs.rollCount=50000000
agent1.sinks.hdfs-sink1_2.hdfs.fileType=DataStream
agent1.sinks.hdfs-sink1_2.hdfs.writeFormat=Text
agent1.sinks.hdfs-sink1_2.hdfs.useLocalTimeStamp=false
agent1.channels.fileChannel1_3.type=file
agent1.channels.fileChannel1_3.capacity=200000
agent1.channels.fileChannel1_3.transactionCapacity=10
agent1.channels.fileChannel1_3.checkpointDir=/home/Flume/gamma/001
agent1.channels.fileChannel1_3.dataDirs=/home/Flume/gamma_data
agent1.channels.fileChannel1_3.checkpointOnClose=true
agent1.channels.fileChannel1_3.dataOnClose=true
agent1.sinks.hdfs-sink1_3.type=hdfs
agent1.sinks.hdfs-sink1_3.hdfs.filePrefix = %{basename}
agent1.sinks.hdfs-sink1_3.hdfs.path=hdfs://10.44.209.44:9000/flume_sink/KT
agent1.sinks.hdfs-sink1_3.hdfs.batchSize=1000
agent1.sinks.hdfs-sink1_3.hdfs.rollSize=268435456
agent1.sinks.hdfs-sink1_3.hdfs.rollInterval=0
agent1.sinks.hdfs-sink1_3.hdfs.rollCount=50000000
agent1.sinks.hdfs-sink1_3.hdfs.fileType=DataStream
agent1.sinks.hdfs-sink1_3.hdfs.writeFormat=Text
agent1.sinks.hdfs-sink1_3.hdfs.useLocalTimeStamp=false
agent1.sources.source1_1.channels=fileChannel1_1 fileChannel1_2 fileChannel1_3
agent1.sinks.hdfs-sink1_1.channel=fileChannel1_1
agent1.sinks.hdfs-sink1_2.channel=fileChannel1_2
agent1.sinks.hdfs-sink1_3.channel=fileChannel1_3
agent1.sources.source1_1.selector.type=replicating
agent1.sources.source1_1.selector.header=basename
agent1.sources.source1_1.selector.mapping.CA=fileChannel1_1
agent1.sources.source1_1.selector.mapping.AZ=fileChannel1_2
agent1.sources.source1_1.selector.default=fileChannel1_3

Related

Active BLE Scanning (BlueZ) - Issue with DBus

I've started a project where I need to actively (all the time) scan for BLE devices. I'm on Linux, using BlueZ 5.49, and I use Python to communicate with DBus (1.10.20).
I'm able to start and stop scanning with bluetoothctl and get the BLE advertisement data through DBus (GetManagedObjects() of the BlueZ interface). The problem is that when I let the scan run for many hours, dbus-daemon starts to take more and more RAM, and I'm not able to find how to "flush" what DBus has gathered from BlueZ. Eventually the RAM becomes full and Linux isn't happy.
So I've tried not scanning the entire time, hoping that would let the garbage collector do its cleanup. It didn't work.
I've edited /etc/dbus-1/system.d/bluetooth.conf to remove any interfaces that I didn't need:
<policy user="root">
  <allow own="org.bluez"/>
  <allow send_destination="org.bluez"/>
</policy>
That slowed down the RAM build-up but didn't solve the issue.
I've found a way to inspect which connections have bytes waiting and confirmed that it comes from BlueZ:
Connection :1.74 with pid 3622 '/usr/libexec/bluetooth/bluetoothd --experimental ' (org.bluez):
IncomingBytes=1253544
PeakIncomingBytes=1313072
OutgoingBytes=0
PeakOutgoingBytes=210
And lastly, I've found that someone needs to read what is waiting in DBus in order to free the memory, so I found this: https://stackoverflow.com/a/60665430/15325057
I receive the data that BlueZ is sending over, but the memory still builds up.
The only way I know to free up dbus is to reboot Linux, which is not ideal.
I'm coming to the end of what I understand about DBus, and that's why I'm here today.
If you have any insight that could help me to free dbus from BlueZ messages, it would be highly appreciated.
Thanks in advance
EDIT: Adding the DBus code I use to read the discovered devices:
#!/usr/bin/python3
import dbus

BLUEZ_SERVICE_NAME = "org.bluez"
DBUS_OM_IFACE = "org.freedesktop.DBus.ObjectManager"
DEVICES_IFACE = "org.bluez.Device1"

def main_loop(subproc):
    devinfo = None
    objects = None
    dbussys = dbus.SystemBus()
    dbusconnection = dbussys.get_object(BLUEZ_SERVICE_NAME, "/")
    bluezInterface = dbus.Interface(dbusconnection, DBUS_OM_IFACE)
    while True:
        try:
            objects = bluezInterface.GetManagedObjects()
        except dbus.DBusException as err:
            print("dbus Error : " + str(err))
            pass
        all_devices = (str(path) for path, interfaces in objects.items() if DEVICES_IFACE in interfaces.keys())
        for path, interfaces in objects.items():
            if "org.bluez.Adapter1" not in interfaces.keys():
                continue
            device_list = [d for d in all_devices if d.startswith(path + "/")]
            for dev_path in device_list:
                properties = objects[dev_path][DEVICES_IFACE]
                if "ServiceData" in properties.keys() and "Name" in properties.keys() and "RSSI" in properties.keys():
                    # [... Do someting...]
                    pass
Indeed, BlueZ flushes memory when you stop discovering. So in order to scan continuously, you need to start and stop discovery repeatedly. I discover for 6 seconds, wait 1 second, and then start discovering for 6 seconds again... and so on. If you check the logs, you will see it deletes a lot of stuff when stopping discovery.
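For illustration, a minimal sketch of that 6-seconds-on / 1-second-off cycle, assuming the default adapter at /org/bluez/hci0 and the same pydbus/GLib bindings used in the code further below:
# Hypothetical sketch: cycle discovery on and off so BlueZ can flush its device cache.
from gi.repository import GLib
from pydbus import SystemBus

SCAN_SECONDS = 6    # discover for 6 seconds...
PAUSE_SECONDS = 1   # ...then pause for 1 second before restarting

bus = SystemBus()
adapter = bus.get('org.bluez', '/org/bluez/hci0')  # assumed adapter path

def start_discovery():
    adapter.StartDiscovery()
    GLib.timeout_add_seconds(SCAN_SECONDS, stop_discovery)
    return False  # one-shot timeout, do not repeat

def stop_discovery():
    adapter.StopDiscovery()  # BlueZ frees its cached devices here
    GLib.timeout_add_seconds(PAUSE_SECONDS, start_discovery)
    return False  # one-shot timeout, do not repeat

if __name__ == '__main__':
    start_discovery()
    GLib.MainLoop().run()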
I can't really reproduce your error exactly, but my system is not happy running that tight while loop, repeatedly getting the data from GetManagedObjects.
Below is the code I ran based on your code with a little bit of refactoring...
import dbus

BLUEZ_SERVICE_NAME = "org.bluez"
DBUS_OM_IFACE = "org.freedesktop.DBus.ObjectManager"
ADAPTER_IFACE = "org.bluez.Adapter1"
DEVICES_IFACE = "org.bluez.Device1"

def main_loop():
    devinfo = None
    objects = None
    dbussys = dbus.SystemBus()
    dbusconnection = dbussys.get_object(BLUEZ_SERVICE_NAME, "/")
    bluezInterface = dbus.Interface(dbusconnection, DBUS_OM_IFACE)
    while True:
        objects = bluezInterface.GetManagedObjects()
        for path in objects:
            name = objects[path].get(DEVICES_IFACE, {}).get('Name')
            rssi = objects[path].get(DEVICES_IFACE, {}).get('RSSI')
            service_data = objects[path].get(DEVICES_IFACE, {}).get('ServiceData')
            if all((name, rssi, service_data)):
                print(f'{name} # {rssi} = {service_data}')
                # [... Do someting...]

if __name__ == '__main__':
    main_loop()
I'm not sure what you are trying to do in the broader project but if I can make some recommendations...
A more typical way of scanning for service/manufacturer data is to subscribe to signals in D-Bus that trigger callbacks when something of interest happens.
Below is some code I use to look for iBeacons and Eddystone beacons. This runs using the GLib event loop which is maybe something you have ruled out but is more efficient on resources.
It does use different Python dbus bindings as I find pydbus more "pythonic".
I have left the code in processing the beacons as it might be a useful reference.
import argparse
from gi.repository import GLib
from pydbus import SystemBus
import uuid

DEVICE_INTERFACE = 'org.bluez.Device1'

remove_list = set()

def stop_scan():
    """Stop device discovery and quit event loop"""
    adapter.StopDiscovery()
    mainloop.quit()

def clean_beacons():
    """
    BlueZ D-Bus API does not show duplicates. This is a
    workaround that removes devices that have been found
    during discovery
    """
    not_found = set()
    for rm_dev in remove_list:
        try:
            adapter.RemoveDevice(rm_dev)
        except GLib.Error as err:
            not_found.add(rm_dev)
    for lost in not_found:
        remove_list.remove(lost)

def process_eddystone(data):
    """Print Eddystone data in human readable format"""
    _url_prefix_scheme = ['http://www.', 'https://www.',
                          'http://', 'https://', ]
    _url_encoding = ['.com/', '.org/', '.edu/', '.net/', '.info/',
                     '.biz/', '.gov/', '.com', '.org', '.edu',
                     '.net', '.info', '.biz', '.gov']
    tx_pwr = int.from_bytes([data[1]], 'big', signed=True)
    # Eddystone UID Beacon format
    if data[0] == 0x00:
        namespace_id = int.from_bytes(data[2:12], 'big')
        instance_id = int.from_bytes(data[12:18], 'big')
        print(f'\t\tEddystone UID: {namespace_id} - {instance_id} \u2197 {tx_pwr}')
    # Eddystone URL beacon format
    elif data[0] == 0x10:
        prefix = data[2]
        encoded_url = data[3:]
        full_url = _url_prefix_scheme[prefix]
        for letter in encoded_url:
            if letter < len(_url_encoding):
                full_url += _url_encoding[letter]
            else:
                full_url += chr(letter)
        print(f'\t\tEddystone URL: {full_url} \u2197 {tx_pwr}')

def process_ibeacon(data, beacon_type='iBeacon'):
    """Print iBeacon data in human readable format"""
    print('DATA:', data)
    beacon_uuid = uuid.UUID(bytes=bytes(data[2:18]))
    major = int.from_bytes(bytearray(data[18:20]), 'big', signed=False)
    minor = int.from_bytes(bytearray(data[20:22]), 'big', signed=False)
    tx_pwr = int.from_bytes([data[22]], 'big', signed=True)
    print(f'\t\t{beacon_type}: {beacon_uuid} - {major} - {minor} \u2197 {tx_pwr}')

def ble_16bit_match(uuid_16, srv_data):
    """Expand 16 bit UUID to full 128 bit UUID"""
    uuid_128 = f'0000{uuid_16}-0000-1000-8000-00805f9b34fb'
    return uuid_128 == list(srv_data.keys())[0]

def on_iface_added(owner, path, iface, signal, interfaces_and_properties):
    """
    Event handler for D-Bus interface added.
    Test to see if it is a new Bluetooth device
    """
    iface_path, iface_props = interfaces_and_properties
    if DEVICE_INTERFACE in iface_props:
        on_device_found(iface_path, iface_props[DEVICE_INTERFACE])

def on_device_found(device_path, device_props):
    """
    Handle new Bluetooth device being discover.
    If it is a beacon of type iBeacon, Eddystone, AltBeacon
    then process it
    """
    address = device_props.get('Address')
    address_type = device_props.get('AddressType')
    name = device_props.get('Name')
    alias = device_props.get('Alias')
    paired = device_props.get('Paired')
    trusted = device_props.get('Trusted')
    rssi = device_props.get('RSSI')
    service_data = device_props.get('ServiceData')
    manufacturer_data = device_props.get('ManufacturerData')
    if address.casefold() == '00:c3:f4:f1:58:69':
        print('Found mac address of interest')
    if service_data and ble_16bit_match('feaa', service_data):
        process_eddystone(service_data['0000feaa-0000-1000-8000-00805f9b34fb'])
        remove_list.add(device_path)
    elif manufacturer_data:
        for mfg_id in manufacturer_data:
            # iBeacon 0x004c
            if mfg_id == 0x004c and manufacturer_data[mfg_id][0] == 0x02:
                process_ibeacon(manufacturer_data[mfg_id])
                remove_list.add(device_path)
            # AltBeacon 0xacbe
            elif mfg_id == 0xffff and manufacturer_data[mfg_id][0:2] == [0xbe, 0xac]:
                process_ibeacon(manufacturer_data[mfg_id], beacon_type='AltBeacon')
                remove_list.add(device_path)
    clean_beacons()

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--duration', type=int, default=0,
                        help='Duration of scan [0 for continuous]')
    args = parser.parse_args()
    bus = SystemBus()
    adapter = bus.get('org.bluez', '/org/bluez/hci0')
    bus.subscribe(iface='org.freedesktop.DBus.ObjectManager',
                  signal='InterfacesAdded',
                  signal_fired=on_iface_added)
    mainloop = GLib.MainLoop()
    if args.duration > 0:
        GLib.timeout_add_seconds(args.duration, stop_scan)
    adapter.SetDiscoveryFilter({'DuplicateData': GLib.Variant.new_boolean(False)})
    adapter.StartDiscovery()
    try:
        print('\n\tUse CTRL-C to stop discovery\n')
        mainloop.run()
    except KeyboardInterrupt:
        stop_scan()

Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR when use openstacksdk to create_server

When I create the OpenStack server, I get the exception below:
Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR
My code is below:
server_args = {
    "name": server_name,
    "image_id": image_id,
    "flavor_id": flavor_id,
    "networks": [{"uuid": network.id}],
    "admin_password": admin_password,
}
try:
    server = user_conn.conn.compute.create_server(**server_args)
    server = user_conn.conn.compute.wait_for_server(server)
except Exception as e:  # here I catch and re-raise the exception
    raise e
When calling create_server, my server_args data is as below:
{'flavor_id': 'd4424892-4165-494e-bedc-71dc97a73202', 'networks': [{'uuid': 'da4e3433-2b21-42bb-befa-6e1e26808a99'}], 'admin_password': '123456', 'name': '133456', 'image_id': '60f4005e-5daf-4aef-a018-4c6b2ff06b40'}
My openstacksdk version is 0.9.18.
In the end, I found that the flavor was too big for the OpenStack compute node, so I changed it to a smaller flavor and the creation succeeded.
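For illustration, a hedged sketch of picking a smaller flavor through the SDK before creating the server; the vCPU/RAM thresholds are made-up examples and user_conn.conn is the authenticated connection from the question:
# Hypothetical sketch: choose a smaller flavor so the compute node can host the server.
small_flavor = None
for flavor in user_conn.conn.compute.flavors():
    # pick the first flavor under example thresholds (adjust to your workload)
    if flavor.vcpus <= 1 and flavor.ram <= 2048:
        small_flavor = flavor
        break

if small_flavor is not None:
    server_args["flavor_id"] = small_flavor.id
    server = user_conn.conn.compute.create_server(**server_args)
    server = user_conn.conn.compute.wait_for_server(server)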

Internal error in the mapping processor: java.lang.NullPointerException

I'm trying to map a local POJO to autogenerated domain objects using MapStruct. Except for one specific complex structure, everything else seems to map and the mapper implementation class gets generated. Below is the error that I get.
My mapper class is:
@Mappings({
    @Mapping(source = "sourcefile", target = "sourceFILE"),
    @Mapping(source = "id", target = "ID"),
    @Mapping(source = "reg", target = "regID"),
    @Mapping(source = "itemDetailsType", target = "ItemDetailsType") // This is the structure that does not map
})
AutoGenDomainType map(LocalPojo localPojo);

@Mappings({
    @Mapping(source = "line", target = "LINE"),
    @Mapping(source = "type", target = "TYPE")
})
ItemDetailsType map(ItemDetailsTypes itemDetailsType);
Error:
Internal error in the mapping processor: java.lang.NullPointerException
at org.mapstruct.ap.internal.processor.creation.MappingResolverImpl$ResolvingAttempt.hasCompatibleCopyConstructor(MappingResolverImpl.java:547)
at org.mapstruct.ap.internal.processor.creation.MappingResolverImpl$ResolvingAttempt.isPropertyMappable(MappingResolverImpl.java:522)
at org.mapstruct.ap.internal.processor.creation.MappingResolverImpl$ResolvingAttempt.getTargetAssignment(MappingResolverImpl.java:202)
at org.mapstruct.ap.internal.processor.creation.MappingResolverImpl$ResolvingAttempt.access$100(MappingResolverImpl.java:153)
at org.mapstruct.ap.internal.processor.creation.MappingResolverImpl.getTargetAssignment(MappingResolverImpl.java:121)
.....
.....
[ERROR]
[ERROR] Found 1 error and 16 warnings.
[ERROR] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project uwo-services: Compilation failure
The target object ItemDetailsType does have other properties that need not be mapped. The error says compilation failure, but I don't find any. I have also tried adding unmappedTargetPolicy = ReportingPolicy.IGNORE at my mapper class level, in case this is caused by the unmapped properties, but still no solution.
This is a known bug in MapStruct. The bug is reported in #729 and has been fixed in 1.1.0.Final. You are using 1.0.0.Final. I would highly suggest switching to either 1.1.0.Final or 1.2.0.Beta2.
Once you update you will see a better error message and you will know exactly what the problem in the mapping is.
Looking at this, at first glance the target in @Mapping(source = "itemDetailsType", target = "ItemDetailsType") seems wrong. Are you sure that you need a capital letter there?

Spark using map in cluster mode

I have an immutable map in my class. When I run my code in local mode, there is no problem and I can reach every key in the map. However, when I run my code in cluster mode, the nodes throw an error about not finding the key in the map.
Here is what I've tried up to now:
- Broadcast the immutable map over the cluster:
broadcast = sc.broadcast(my_immutable_map)
- Parallelize the map as a pair RDD:
my_map_rdd = sc.parallelize( my_immutable_map.toSeq)
When I examine the logs, I see a key-not-found exception.
My error stacktrace is as follows:
Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 15.0 failed 4 times, most recent failure: Lost task 1.3 in stage 15.0 (TID 25, datanode1.big.com): java.util.NoSuchElementException: key not found: 905053199731
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at havelsan.CDRGenerator$.generate_random_target(CDRGenerator.scala:95)
at havelsan.CDRGenerator$$anonfun$main$2$$anonfun$6.apply(CDRGenerator.scala:167)
at havelsan.CDRGenerator$$anonfun$main$2$$anonfun$6.apply(CDRGenerator.scala:165)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1197)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$6.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1251)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1205)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1185)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Can you explain how Spark distributes maps, and how it is possible that some nodes can't find some keys in this map? By the way, my Spark version is 1.6.0.
What am I missing?
UPDATE
This part is for initializing the map on the driver.
...
var pd = sc.textFile( "hdfs://...")
my_immutable_map = pd.map( line => line.split(":") ).map{ line => (line(0), line(1).split(","))}.collectAsMap
...
broadcast = sc.broadcast(my_immutable_map)
my_map_rdd = sc.parallelize( my_immutable_map.toSeq)
And this is the part where I got the error.
def my_func(key: String): String = {
  ...
  my_value = broadcast.value(key)
  ...
}
my_func is called inside a map as:
my_another_rdd.map { line =>
  val key = line.split(",")(0)
  my_func(key)
}
The solution that I found is to pass the broadcast value to the function as a parameter (sketched below). Still, I couldn't find a solution for the parallelize method.
https://stackoverflow.com/a/34912887/4668959
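For illustration, a minimal sketch of that workaround; sc, my_immutable_map and my_another_rdd are the names from the question, and the "UNKNOWN" fallback is just an example:
// Hypothetical sketch: pass the broadcast handle into the function instead of
// referencing it through driver-side state that executors may not see.
import org.apache.spark.broadcast.Broadcast

def my_func(key: String, bc: Broadcast[scala.collection.Map[String, Array[String]]]): String = {
  // look the key up safely and fall back instead of throwing NoSuchElementException
  bc.value.get(key).map(_.mkString(",")).getOrElse("UNKNOWN")
}

val my_broadcast = sc.broadcast(my_immutable_map)
val result_rdd = my_another_rdd.map { line =>
  val key = line.split(",")(0)
  my_func(key, my_broadcast)
}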

Allocating memory in Flash for user data (STM32F4 HAL)

I'm trying to use the internal flash of an STM32F405 to store a bunch of user-settable bytes that persist across reboots.
I'm using:
uint8_t userConfig[64] __attribute__((at(0x0800C000)));
to allocate memory for the data I want to store.
When the program starts, I check whether the first byte is set to 0x42; if not, I set it using:
HAL_FLASH_Unlock();
HAL_FLASH_Program(TYPEPROGRAM_BYTE, &userConfig[0], 0x42);
HAL_FLASH_Lock();
After that I check the value in userConfig[0] and I see 0x42... Great!
When I hit reset, however, and look at the location again, it's not 0x42 anymore...
Any idea where I'm going wrong? I've also tried:
#pragma location = 0x0800C000
volatile const uint8_t userConfig[64]
but I get the same result.
Okay I found an answer on the ST forums thanks to clive1. This example works for an STM32F405xG.
First we need to modify the memory layout in the linker script file (.ld file)
Modify the existing FLASH entry and add a new line for DATA. Here I've allocated all of sector 11.
MEMORY
{
  FLASH (RX) : ORIGIN = 0x08000000, LENGTH = 1M-128K
  DATA (RWX) : ORIGIN = 0x080E0000, LENGTH = 128K
  ...
  ...
}
Manual for editing linker files on the sourceware website
In the same file, we need to add:
.user_data :
{
. = ALIGN(4);
*(.user_data)
. = ALIGN(4);
} > DATA
This creates a section called .user_data that we can address in the program code.
Finally, in your .c file add:
__attribute__((__section__(".user_data"))) const uint8_t userConfig[64]
This specifies that we wish to store the userConfig variable in the .user_data section and const makes sure the address of userConfig is kept static.
Now, to write to this area of flash during runtime, you can use the stm32f4 stdlib or HAL flash driver.
Before you can write to the flash, it has to be erased (all bytes set to 0xFF). The instructions for the HAL library say nothing about doing this, for some reason...
HAL_FLASH_Unlock();
__HAL_FLASH_CLEAR_FLAG(FLASH_FLAG_EOP | FLASH_FLAG_OPERR | FLASH_FLAG_WRPERR | FLASH_FLAG_PGAERR | FLASH_FLAG_PGSERR );
FLASH_Erase_Sector(FLASH_SECTOR_11, VOLTAGE_RANGE_3);
HAL_FLASH_Program(TYPEPROGRAM_WORD, &userConfig[index], someData);
HAL_FLASH_Lock();
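For completeness, a hedged usage sketch tying this back to the 0x42 check from the question; it assumes the sector has been erased at least once as shown above, and casts the destination because HAL_FLASH_Program takes the address as a uint32_t:
/* Hypothetical sketch: persist a marker byte on first boot and read it back after reset.
 * Assumes userConfig is the flash-resident array placed in .user_data above. */
if (userConfig[0] != 0x42)
{
    HAL_FLASH_Unlock();
    /* HAL_FLASH_Program expects the destination as a uint32_t address */
    HAL_FLASH_Program(TYPEPROGRAM_BYTE, (uint32_t)&userConfig[0], 0x42);
    HAL_FLASH_Lock();
}
/* After a reset, userConfig[0] still reads 0x42 directly from flash. */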
