How do I make big.mark apply to more than just the first column?

Here's what I'm getting:
> panderOptions('big.mark', ',')
> foo <- rbind(cancer, cancer); for(i in 1:8) foo <- rbind(foo, foo)
> pander(table(foo$ph.karno, foo$pat.karno))
------------------------------------------------------------
              30    40    50    60    70     80     90   100
--------- ------ ----- ----- ----- ----- ------ ------ -----
 **50**        0   512   512   512   512    512      0     0
 **60**        0   512   512  2560  4608   1536      0     0
 **70**    1,024     0  1024  4608  3072   3072   1536  1536
 **80**        0     0     0  6144  6656   6656  10240  4608
 **90**        0     0     0  1536  5120  10752  12288  7680
 **100**       0     0     0     0   512   3584   6656  4096
------------------------------------------------------------
I would like the comma delimiter to show up in the other columns too. How do I do that?

Best result I got (with t being your table call result):
> pander(format(t, big.mark = ','))
----------------------------------------------------------------
              30    40     50     60     70      80      90    100
--------- ------ ----- ------ ------ ------ ------- ------- ------
 **50**        0   512    512    512    512     512       0      0
 **60**        0   512    512  2,560  4,608   1,536       0      0
 **70**    1,024     0  1,024  4,608  3,072   3,072   1,536  1,536
 **80**        0     0      0  6,144  6,656   6,656  10,240  4,608
 **90**        0     0      0  1,536  5,120  10,752  12,288  7,680
 **100**       0     0      0      0    512   3,584   6,656  4,096
----------------------------------------------------------------
I assume this is a bug in pander.table.return, but I did not dig enough to find the root cause.
Edit: I found the reason. At line 283 of the function there is a for loop calling sapply on each column, but once the first column has been processed, the whole table is converted to character, because we bring in characters from the output of format.
The subsequent calls to format then cannot apply big.mark, because the numbers have already been coerced to character.
Code from pander.table.return giving this behavior:
for (j in 1:ncol(t)) {
    temp.t[, j] <- sapply(temp.t[, j], format, trim = TRUE,
                          digits = digits[j], decimal.mark = decimal.mark,
                          big.mark = big.mark)
}
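A minimal demonstration of the coercion (plain base R, independent of pander): format applies big.mark to numeric input but returns character input unchanged, which is why only the first processed column keeps its separators.
format(2560, big.mark = ",")    # "2,560" -- numeric input gets the separator
format("2560", big.mark = ",")  # "2560"  -- character input is passed through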

Related

Slow query in MariaDB-10.5.15, fetching time, with a simple select

I'm having a problem with a slow query in MariaDB which is driving me crazy; it's X-file worthy.
I have a simple table, shown below, with only 10 rows, and a simple select * from the table takes more than 180 seconds to return!!!
I provide the following data in case you spot something strange and can help me:
a) The structure of the table, highlighting the mediumblob field where an image is saved.
b) The mysql slow query log output.
c) Global variables.
a) Table structure
CREATE TABLE `clientespendientes` (
`numeroid` varchar(40) NOT NULL DEFAULT '' COMMENT 'Client DNI or passport number',
`direccionpublica` blob NOT NULL COMMENT 'Client public address',
`email` varchar(101) NOT NULL DEFAULT '' COMMENT 'User email',
`documento` mediumblob NOT NULL COMMENT 'Client DNI or passport',
`fecha` datetime NOT NULL DEFAULT '0000-00-00 00:00:00' COMMENT 'Time the verification request was created',
`nacionalidad` tinyint(4) NOT NULL DEFAULT 0 COMMENT 'Client nationality',
`idioma` tinyint(3) unsigned NOT NULL DEFAULT 0 COMMENT 'Language for outgoing mails',
`ip` varchar(16) DEFAULT NULL COMMENT 'IP from which the client signed up',
`nombrefichero` varchar(30) DEFAULT '' COMMENT 'Name of the file with which we must return the public key',
PRIMARY KEY (`numeroid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Clients pending verification'
b) mysql slow query output
Time: 220719 23:46:08
User@Host: cabe[cabe] @ [217.61.225.227]
Thread_id: 33 Schema: litra QC_hit: No
**Query_time: 179.691794** Lock_time: 0.000120 Rows_sent: 10 Rows_examined: 10
Rows_affected: 0 Bytes_sent: 5504314
Full_scan: Yes Full_join: No Tmp_table: No Tmp_table_on_disk: No
Filesort: No Filesort_on_disk: No Merge_passes: 0 Priority_queue: No
explain: id select_type table type possible_keys key key_len ref rows r_rows filtered r_filtered Extra
explain: 1 SIMPLE clientespendientes ALL NULL NULL NULL NULL 10 10.00 100.00 100.00
SET timestamp=1658274368;
SELECT * FROM litra.clientespendientes
LIMIT 0, 1000;
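Note that Rows_examined is only 10 yet Bytes_sent is 5,504,314 (about 5.5 MB), so the elapsed time may be dominated by transferring the mediumblob images rather than by scanning the table. A hedged diagnostic sketch (standard MariaDB functions; not part of the original post) to see how much of each row is blob payload:
-- Hypothetical check: bytes contributed by each row's blob columns
SELECT numeroid,
       OCTET_LENGTH(documento)        AS documento_bytes,
       OCTET_LENGTH(direccionpublica) AS direccion_bytes
FROM litra.clientespendientes
ORDER BY documento_bytes DESC;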
c) Global variables.
analyze_sample_percentage 100.000000
aria_block_size 8192
aria_checkpoint_interval 30
aria_checkpoint_log_activity 1048576
aria_encrypt_tables OFF
aria_force_start_after_recovery_failures 0
aria_group_commit none
aria_group_commit_interval 0
aria_log_file_size 1073741824
aria_log_purge_type immediate
aria_max_sort_file_size 9223372036853727232
aria_page_checksum ON
aria_pagecache_age_threshold 300
aria_pagecache_buffer_size 134217728
aria_pagecache_division_limit 100
aria_pagecache_file_hash_size 512
aria_recover_options BACKUP,QUICK
aria_repair_threads 1
aria_sort_buffer_size 268434432
aria_stats_method nulls_unequal
aria_sync_log_dir NEWFILE
aria_used_for_temp_tables ON
auto_increment_increment 1
auto_increment_offset 1
autocommit ON
automatic_sp_privileges ON
back_log 80
basedir /usr
big_tables OFF
bind_address 0.0.0.0
binlog_annotate_row_events ON
binlog_cache_size 32768
binlog_checksum CRC32
binlog_commit_wait_count 0
binlog_commit_wait_usec 100000
binlog_direct_non_transactional_updates OFF
binlog_file_cache_size 16384
binlog_format MIXED
binlog_optimize_thread_scheduling ON
binlog_row_image FULL
binlog_row_metadata NO_LOG
binlog_stmt_cache_size 32768
bulk_insert_buffer_size 8388608
character_set_client utf8
character_set_connection utf8
character_set_database utf8mb4
character_set_filesystem binary
character_set_results utf8
character_set_server utf8mb4
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
check_constraint_checks ON
collation_connection utf8_general_ci
collation_database utf8mb4_general_ci
collation_server utf8mb4_general_ci
column_compression_threshold 100
column_compression_zlib_level 6
column_compression_zlib_strategy DEFAULT_STRATEGY
column_compression_zlib_wrap OFF
completion_type NO_CHAIN
concurrent_insert AUTO
connect_timeout 10
core_file OFF
datadir /var/lib/mysql/
date_format %Y-%m-%d
datetime_format %Y-%m-%d %H:%i:%s
deadlock_search_depth_long 15
deadlock_search_depth_short 4
deadlock_timeout_long 50000000
deadlock_timeout_short 10000
debug_no_thread_alarm OFF
default_master_connection
default_password_lifetime 0
default_regex_flags
default_storage_engine InnoDB
default_tmp_storage_engine
default_week_format 0
delay_key_write ON
delayed_insert_limit 100
delayed_insert_timeout 300
delayed_queue_size 1000
disconnect_on_expired_password OFF
div_precision_increment 4
encrypt_binlog OFF
encrypt_tmp_disk_tables OFF
encrypt_tmp_files OFF
enforce_storage_engine
eq_range_index_dive_limit 200
error_count 0
event_scheduler OFF
expensive_subquery_limit 100
expire_logs_days 10
explicit_defaults_for_timestamp OFF
external_user
extra_max_connections 1
extra_port 0
flush OFF
flush_time 0
foreign_key_checks ON
ft_boolean_syntax + -><()~*:""&|
ft_max_word_len 84
ft_min_word_len 4
ft_query_expansion_limit 20
ft_stopword_file (built-in)
general_log ON
general_log_file /var/log/mysql/mysql.log
group_concat_max_len 1048576
gtid_binlog_pos
gtid_binlog_state
gtid_cleanup_batch_size 64
gtid_current_pos
gtid_domain_id 0
gtid_ignore_duplicates OFF
gtid_pos_auto_engines
gtid_seq_no 0
gtid_slave_pos
gtid_strict_mode OFF
have_compress YES
have_crypt YES
have_dynamic_loading YES
have_geometry YES
have_openssl YES
have_profiling YES
have_query_cache YES
have_rtree_keys YES
have_ssl DISABLED
have_symlink YES
histogram_size 254
histogram_type DOUBLE_PREC_HB
host_cache_size 279
hostname Litra0
identity 0
idle_readonly_transaction_timeout 0
idle_transaction_timeout 0
idle_write_transaction_timeout 0
ignore_builtin_innodb OFF
ignore_db_dirs
in_predicate_conversion_threshold 1000
in_transaction 0
init_connect
init_file
init_slave
innodb_adaptive_flushing ON
innodb_adaptive_flushing_lwm 10.000000
innodb_adaptive_hash_index OFF
innodb_adaptive_hash_index_parts 8
innodb_adaptive_max_sleep_delay 0
innodb_autoextend_increment 64
innodb_autoinc_lock_mode 1
innodb_background_scrub_data_check_interval 0
innodb_background_scrub_data_compressed OFF
innodb_background_scrub_data_interval 0
innodb_background_scrub_data_uncompressed OFF
innodb_buf_dump_status_frequency 0
innodb_buffer_pool_chunk_size 134217728
innodb_buffer_pool_dump_at_shutdown ON
innodb_buffer_pool_dump_now OFF
innodb_buffer_pool_dump_pct 25
innodb_buffer_pool_filename ib_buffer_pool
innodb_buffer_pool_instances 1
innodb_buffer_pool_load_abort OFF
innodb_buffer_pool_load_at_startup ON
innodb_buffer_pool_load_now OFF
innodb_buffer_pool_size 1073741824
innodb_change_buffer_max_size 25
innodb_change_buffering none
innodb_checksum_algorithm full_crc32
innodb_cmp_per_index_enabled OFF
innodb_commit_concurrency 0
innodb_compression_algorithm zlib
innodb_compression_default OFF
innodb_compression_failure_threshold_pct 5
innodb_compression_level 6
innodb_compression_pad_pct_max 50
innodb_concurrency_tickets 0
innodb_data_file_path ibdata1:12M:autoextend
innodb_data_home_dir
innodb_deadlock_detect ON
innodb_default_encryption_key_id 1
innodb_default_row_format dynamic
innodb_defragment OFF
innodb_defragment_fill_factor 0.900000
innodb_defragment_fill_factor_n_recs 20
innodb_defragment_frequency 40
innodb_defragment_n_pages 7
innodb_defragment_stats_accuracy 0
innodb_disable_sort_file_cache OFF
innodb_disallow_writes OFF
innodb_doublewrite ON
innodb_encrypt_log OFF
innodb_encrypt_tables OFF
innodb_encrypt_temporary_tables OFF
innodb_encryption_rotate_key_age 1
innodb_encryption_rotation_iops 100
innodb_encryption_threads 0
innodb_fast_shutdown 1
innodb_fatal_semaphore_wait_threshold 600
innodb_file_format
innodb_file_per_table ON
innodb_fill_factor 100
innodb_flush_log_at_timeout 1
innodb_flush_log_at_trx_commit 1
innodb_flush_method fsync
innodb_flush_neighbors 1
innodb_flush_sync ON
innodb_flushing_avg_loops 30
innodb_force_load_corrupted OFF
innodb_force_primary_key OFF
innodb_force_recovery 0
innodb_ft_aux_table
innodb_ft_cache_size 8000000
innodb_ft_enable_diag_print OFF
innodb_ft_enable_stopword ON
innodb_ft_max_token_size 84
innodb_ft_min_token_size 3
innodb_ft_num_word_optimize 2000
innodb_ft_result_cache_limit 2000000000
innodb_ft_server_stopword_table
innodb_ft_sort_pll_degree 2
innodb_ft_total_cache_size 640000000
innodb_ft_user_stopword_table
innodb_immediate_scrub_data_uncompressed OFF
innodb_instant_alter_column_allowed add_drop_reorder
innodb_io_capacity 200
innodb_io_capacity_max 2000
innodb_large_prefix
innodb_lock_schedule_algorithm fcfs
innodb_lock_wait_timeout 50
innodb_log_buffer_size 16777216
innodb_log_checksums ON
innodb_log_compressed_pages ON
innodb_log_file_size 100663296
innodb_log_files_in_group 1
innodb_log_group_home_dir ./
innodb_log_optimize_ddl OFF
innodb_log_write_ahead_size 8192
innodb_lru_flush_size 32
innodb_lru_scan_depth 1536
innodb_max_dirty_pages_pct 90.000000
innodb_max_dirty_pages_pct_lwm 0.000000
innodb_max_purge_lag 0
innodb_max_purge_lag_delay 0
innodb_max_purge_lag_wait 4294967295
innodb_max_undo_log_size 10485760
innodb_monitor_disable
innodb_monitor_enable
innodb_monitor_reset
innodb_monitor_reset_all
innodb_old_blocks_pct 37
innodb_old_blocks_time 1000
innodb_online_alter_log_max_size 134217728
innodb_open_files 2000
innodb_optimize_fulltext_only OFF
innodb_page_cleaners 1
innodb_page_size 16384
innodb_prefix_index_cluster_optimization OFF
innodb_print_all_deadlocks OFF
innodb_purge_batch_size 300
innodb_purge_rseg_truncate_frequency 128
innodb_purge_threads 4
innodb_random_read_ahead OFF
innodb_read_ahead_threshold 56
innodb_read_io_threads 4
innodb_read_only OFF
innodb_replication_delay 0
innodb_rollback_on_timeout OFF
innodb_scrub_log OFF
innodb_scrub_log_speed 256
innodb_sort_buffer_size 1048576
innodb_spin_wait_delay 4
innodb_stats_auto_recalc ON
innodb_stats_include_delete_marked OFF
innodb_stats_method nulls_equal
innodb_stats_modified_counter 0
innodb_stats_on_metadata OFF
innodb_stats_persistent ON
innodb_stats_persistent_sample_pages 20
innodb_stats_traditional ON
innodb_stats_transient_sample_pages 8
innodb_status_output OFF
innodb_status_output_locks OFF
innodb_strict_mode ON
innodb_sync_array_size 1
innodb_sync_spin_loops 30
innodb_table_locks ON
innodb_temp_data_file_path ibtmp1:12M:autoextend
innodb_thread_concurrency 0
innodb_thread_sleep_delay 0
innodb_tmpdir
innodb_undo_directory ./
innodb_undo_log_truncate OFF
innodb_undo_logs 128
innodb_undo_tablespaces 0
innodb_use_atomic_writes ON
innodb_use_native_aio ON
innodb_version 10.5.15
innodb_write_io_threads 4
insert_id 0
interactive_timeout 28800
join_buffer_size 262144
join_buffer_space_limit 2097152
join_cache_level 2
keep_files_on_create OFF
key_buffer_size 268435456
key_cache_age_threshold 300
key_cache_block_size 1024
key_cache_division_limit 100
key_cache_file_hash_size 512
key_cache_segments 0
large_files_support ON
large_page_size 0
large_pages OFF
last_gtid
last_insert_id 0
lc_messages en_US
lc_messages_dir /usr/share/mysql
lc_time_names en_US
license GPL
local_infile ON
lock_wait_timeout 86400
locked_in_memory OFF
log_bin OFF
log_bin_basename
log_bin_compress OFF
log_bin_compress_min_len 256
log_bin_index
log_bin_trust_function_creators OFF
log_disabled_statements sp
log_error /var/log/mysql/error.log
log_output FILE
log_queries_not_using_indexes OFF
log_slave_updates OFF
log_slow_admin_statements ON
log_slow_disabled_statements sp
log_slow_filter admin,filesort,filesort_on_disk,filesort_priority_queue,full_join,full_scan,query_cache,query_cache_miss,tmp_table,tmp_table_on_disk
log_slow_rate_limit 1
log_slow_slave_statements ON
log_slow_verbosity query_plan,explain
log_tc_size 24576
log_warnings 2
long_query_time 10.000000
low_priority_updates OFF
lower_case_file_system OFF
lower_case_table_names 0
master_verify_checksum OFF
max_allowed_packet 1073741824
max_binlog_cache_size 18446744073709547520
max_binlog_size 1073741824
max_binlog_stmt_cache_size 18446744073709547520
max_connect_errors 100
max_connections 151
max_delayed_threads 20
max_digest_length 1024
max_error_count 64
max_heap_table_size 16777216
max_insert_delayed_threads 20
max_join_size 18446744073709551615
max_length_for_sort_data 1024
max_password_errors 4294967295
max_prepared_stmt_count 16382
max_recursive_iterations 4294967295
max_relay_log_size 1073741824
max_rowid_filter_size 131072
max_seeks_for_key 4294967295
max_session_mem_used 9223372036854775807
max_sort_length 1024
max_sp_recursion_depth 0
max_statement_time 0.000000
max_tmp_tables 32
max_user_connections 0
max_write_lock_count 4294967295
metadata_locks_cache_size 1024
metadata_locks_hash_instances 8
min_examined_row_limit 0
mrr_buffer_size 262144
myisam_block_size 1024
myisam_data_pointer_size 6
myisam_max_sort_file_size 9223372036853727232
myisam_mmap_size 18446744073709551615
myisam_recover_options BACKUP,QUICK
myisam_repair_threads 1
myisam_sort_buffer_size 134216704
myisam_stats_method NULLS_UNEQUAL
myisam_use_mmap OFF
mysql56_temporal_format ON
net_buffer_length 16384
net_read_timeout 30
net_retry_count 10
net_write_timeout 60
old OFF
old_alter_table DEFAULT
old_mode
old_passwords OFF
open_files_limit 32194
optimizer_max_sel_arg_weight 32000
optimizer_prune_level 1
optimizer_search_depth 62
optimizer_selectivity_sampling_limit 100
optimizer_switch index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_cond>
optimizer_trace enabled=off
optimizer_trace_max_mem_size 1048576
optimizer_use_condition_selectivity 4
performance_schema OFF
performance_schema_accounts_size -1
performance_schema_digests_size -1
performance_schema_events_stages_history_long_size -1
performance_schema_events_stages_history_size -1
performance_schema_events_statements_history_long_size -1
performance_schema_events_statements_history_size -1
performance_schema_events_transactions_history_long_size -1
performance_schema_events_transactions_history_size -1
performance_schema_events_waits_history_long_size -1
performance_schema_events_waits_history_size -1
performance_schema_hosts_size -1
performance_schema_max_cond_classes 90
performance_schema_max_cond_instances -1
performance_schema_max_digest_length 1024
performance_schema_max_file_classes 80
performance_schema_max_file_handles 32768
performance_schema_max_file_instances -1
performance_schema_max_index_stat -1
performance_schema_max_memory_classes 320
performance_schema_max_metadata_locks -1
performance_schema_max_mutex_classes 210
performance_schema_max_mutex_instances -1
performance_schema_max_prepared_statements_instances -1
performance_schema_max_program_instances -1
performance_schema_max_rwlock_classes 50
performance_schema_max_rwlock_instances -1
performance_schema_max_socket_classes 10
performance_schema_max_socket_instances -1
performance_schema_max_sql_text_length 1024
performance_schema_max_stage_classes 160
performance_schema_max_statement_classes 222
performance_schema_max_statement_stack 10
performance_schema_max_table_handles -1
performance_schema_max_table_instances -1
performance_schema_max_table_lock_stat -1
performance_schema_max_thread_classes 50
performance_schema_max_thread_instances -1
performance_schema_session_connect_attrs_size -1
performance_schema_setup_actors_size -1
performance_schema_setup_objects_size -1
performance_schema_users_size -1
pid_file /run/mysqld/mysqld.pid
plugin_dir /usr/lib/mysql/plugin/
plugin_maturity gamma
port 3306
preload_buffer_size 32768
profiling OFF
profiling_history_size 15
progress_report_time 5
protocol_version 10
proxy_protocol_networks
proxy_user
pseudo_slave_mode OFF
pseudo_thread_id 37
query_alloc_block_size 16384
query_cache_limit 1048576
query_cache_min_res_unit 4096
query_cache_size 1048576
query_cache_strip_comments OFF
query_cache_type OFF
query_cache_wlock_invalidate OFF
query_prealloc_size 24576
rand_seed1 338934328
rand_seed2 991539275
range_alloc_block_size 4096
read_binlog_speed_limit 0
read_buffer_size 131072
read_only OFF
read_rnd_buffer_size 262144
relay_log
relay_log_basename
relay_log_index
relay_log_info_file relay-log.info
relay_log_purge ON
relay_log_recovery OFF
relay_log_space_limit 0
replicate_annotate_row_events ON
replicate_do_db
replicate_do_table
replicate_events_marked_for_skip REPLICATE
replicate_ignore_db
replicate_ignore_table
replicate_wild_do_table
replicate_wild_ignore_table
report_host
report_password
report_port 3306
report_user
require_secure_transport OFF
rowid_merge_buff_size 8388608
rpl_semi_sync_master_enabled OFF
rpl_semi_sync_master_timeout 10000
rpl_semi_sync_master_trace_level 32
rpl_semi_sync_master_wait_no_slave ON
rpl_semi_sync_master_wait_point AFTER_COMMIT
rpl_semi_sync_slave_delay_master OFF
rpl_semi_sync_slave_enabled OFF
rpl_semi_sync_slave_kill_conn_timeout 5
rpl_semi_sync_slave_trace_level 32
secure_auth ON
secure_file_priv
secure_timestamp NO
server_id 1
session_track_schema ON
session_track_state_change OFF
session_track_system_variables autocommit,character_set_client,character_set_connection,character_set_results,time_zone
session_track_transaction_info OFF
skip_external_locking ON
skip_name_resolve OFF
skip_networking OFF
skip_parallel_replication OFF
skip_replication OFF
skip_show_database OFF
slave_compressed_protocol OFF
slave_ddl_exec_mode IDEMPOTENT
slave_domain_parallel_threads 0
slave_exec_mode STRICT
slave_load_tmpdir /tmp
slave_max_allowed_packet 1073741824
slave_net_timeout 60
slave_parallel_max_queued 131072
slave_parallel_mode optimistic
slave_parallel_threads 0
slave_parallel_workers 0
slave_run_triggers_for_rbr NO
slave_skip_errors OFF
slave_sql_verify_checksum ON
slave_transaction_retries 10
slave_transaction_retry_errors 1158,1159,1160,1161,1205,1213,1429,2013,12701
slave_transaction_retry_interval 0
slave_type_conversions
slow_launch_time 2
slow_query_log ON
slow_query_log_file /var/log/mysql/mariadb-slow.log
socket /run/mysqld/mysqld.sock
sort_buffer_size 2097152
sql_auto_is_null OFF
sql_big_selects ON
sql_buffer_result OFF
sql_if_exists OFF
sql_log_bin ON
sql_log_off OFF
sql_mode STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
sql_notes ON
sql_quote_show_create ON
sql_safe_updates OFF
sql_select_limit 18446744073709551615
sql_slave_skip_counter 0
sql_warnings OFF
ssl_ca
ssl_capath
ssl_cert
ssl_cipher
ssl_crl
ssl_crlpath
ssl_key
standard_compliant_cte ON
storage_engine InnoDB
stored_program_cache 256
strict_password_validation ON
sync_binlog 0
sync_frm ON
sync_master_info 10000
sync_relay_log 10000
sync_relay_log_info 10000
system_time_zone UTC
system_versioning_alter_history ERROR
system_versioning_asof DEFAULT
table_definition_cache 400
table_open_cache 2000
table_open_cache_instances 8
tcp_keepalive_interval 0
tcp_keepalive_probes 0
tcp_keepalive_time 0
tcp_nodelay ON
thread_cache_size 151
thread_handling one-thread-per-connection
thread_pool_dedicated_listener OFF
thread_pool_exact_stats OFF
thread_pool_idle_timeout 60
thread_pool_max_threads 65536
thread_pool_oversubscribe 3
thread_pool_prio_kickup_timer 1000
thread_pool_priority auto
thread_pool_size 12
thread_pool_stall_limit 500
thread_stack 196608
time_format %H:%i:%s
time_zone SYSTEM
timestamp 1658275075.187231
tls_version TLSv1.1,TLSv1.2,TLSv1.3
tmp_disk_table_size 18446744073709551615
tmp_memory_table_size 16777216
tmp_table_size 16777216
tmpdir /tmp
transaction_alloc_block_size 8192
transaction_prealloc_size 4096
tx_isolation REPEATABLE-READ
tx_read_only OFF
unique_checks ON
updatable_views_with_limit YES
use_stat_tables PREFERABLY_FOR_QUERIES
userstat OFF
version 10.5.15-MariaDB-0+deb11u1-log
version_comment Debian 11
version_compile_machine x86_64
version_compile_os debian-linux-gnu
version_malloc_library system
version_source_revision 9aa3564e8a06c3d2027fc514213ecf42b049b06e
version_ssl_library OpenSSL 1.1.1n 15 Mar 2022
wait_timeout 28800
warning_count 0
wsrep_osu_method TOI
wsrep_sr_store table
wsrep_auto_increment_control ON
wsrep_causal_reads OFF
wsrep_certification_rules strict
wsrep_certify_nonpk ON
wsrep_cluster_address
wsrep_cluster_name my_wsrep_cluster
wsrep_convert_lock_to_trx OFF
wsrep_data_home_dir /var/lib/mysql/
wsrep_dbug_option
wsrep_debug NONE
wsrep_desync OFF
wsrep_dirty_reads OFF
wsrep_drupal_282555_workaround OFF
wsrep_forced_binlog_format NONE
wsrep_gtid_domain_id 0
wsrep_gtid_mode OFF
wsrep_gtid_seq_no 0
wsrep_ignore_apply_errors 7
wsrep_load_data_splitting OFF
wsrep_log_conflicts OFF
wsrep_max_ws_rows 0
wsrep_max_ws_size 2147483647
wsrep_mysql_replication_bundle 0
wsrep_node_address
wsrep_node_incoming_address AUTO
wsrep_node_name Litra0
wsrep_notify_cmd
wsrep_on OFF
wsrep_patch_version wsrep_26.22
wsrep_provider none
wsrep_provider_options
wsrep_recover OFF
wsrep_reject_queries NONE
wsrep_replicate_myisam OFF
wsrep_restart_slave OFF
wsrep_retry_autocommit 1
wsrep_slave_fk_checks ON
wsrep_slave_uk_checks OFF
wsrep_slave_threads 1
wsrep_sst_auth
wsrep_sst_donor
wsrep_sst_donor_rejects_queries OFF
wsrep_sst_method rsync
wsrep_sst_receive_address AUTO
wsrep_start_position 00000000-0000-0000-0000-000000000000:-1
wsrep_strict_ddl OFF
wsrep_sync_wait 0
wsrep_trx_fragment_size 0
wsrep_trx_fragment_unit bytes

Nested xtab tables

I would like to produce nested tables for a multilevel factorial experiment. I have 10 paints (the sample below uses three) examined for time to reach an end point under 4 levels of humidity, 3 temperatures, and 2 wind speeds. Of course I have searched online but without success.
Some sample code can be generated using:
## Made-up data. NB: the data are continuous, whereas observations were made at 40/168, so the data are censored.
time3 <- 4 * (1:24)        # Dependent: times in hrs; not really representative but will do
wind  <- c(1, 2)           # Independent: factor, draught on or off
RH    <- c(0, 35, 75, 95)  # Independent: value for RH, but can be processed as a factor
temp  <- c(5, 11, 20)      # Independent: value for temperature, but can be processed as a factor
paint <- c("paintA", "paintB", "paintC")  # Independent: experimental material
# Combine into a data frame
dfa <- data.frame(rep(temp, 8))
dfa$RH    <- rep(RH, 6)
dfa$wind  <- rep(wind, 12)
dfa$time3 <- time3
dfa$paint <- rep(paint[1], 24)
# Replicate for the other paints
dfb <- dfa; dfb$paint <- paint[2]
dfc <- dfa; dfc$paint <- paint[3]
dfx <- do.call("rbind", list(dfa, dfb, dfc))
# Rename the first column
colnames(dfx)[1] <- "temp"
# Prepare xtab tables
tx <- xtabs(dfx$time3 ~ dfx$wind + dfx$RH + dfx$temp + dfx$paint)
tx
The target I hope to obtain would be like this xtab example. This
tx <- xtabs(dfx$time3 ~ dfx$wind + dfx$RH + dfx$temp)
does not work well enough. I would also like to write the result to C:\file.csv for printing, reporting, etc. Please advise on how to achieve the desired output.
You can paste the two variables you want to nest together. Since the items will be ordered lexicographically, you will need to zero-pad the temp variable to get numerical ordering.
xtabs(time3 ~ wind + paste(sprintf("%02d", temp), RH, sep = ":") + paint, dfx)
, , paint = paintA
paste(sprintf("%02d", temp), RH, sep = ":")
wind 05:0 05:35 05:75 05:95 11:0 11:35 11:75 11:95 20:0 20:35 20:75 20:95
1 56 0 104 0 88 0 136 0 120 0 72 0
2 0 128 0 80 0 64 0 112 0 96 0 144
, , paint = paintB
paste(sprintf("%02d", temp), RH, sep = ":")
wind 05:0 05:35 05:75 05:95 11:0 11:35 11:75 11:95 20:0 20:35 20:75 20:95
1 56 0 104 0 88 0 136 0 120 0 72 0
2 0 128 0 80 0 64 0 112 0 96 0 144
, , paint = paintC
paste(sprintf("%02d", temp), RH, sep = ":")
wind 05:0 05:35 05:75 05:95 11:0 11:35 11:75 11:95 20:0 20:35 20:75 20:95
1 56 0 104 0 88 0 136 0 120 0 72 0
2 0 128 0 80 0 64 0 112 0 96 0 144
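For the CSV output mentioned in the question, a minimal sketch in base R (write.ftable is in the stats package; the .csv path is the one named in the question, the .txt path is hypothetical; this is not part of the original answer):
# Long format: one row per wind/RH/temp/paint combination, with a Freq column
write.csv(as.data.frame(tx), "C:/file.csv", row.names = FALSE)
# Or keep a nested layout by flattening the multi-way table first (hypothetical output path)
write.ftable(ftable(tx), file = "C:/ftable.txt")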

Efficient (repeat) looping

I am trying to evaluate whether the price in a given row, price(k), is equal to the one above it, price(k-1). If it is, I want to sum the two volumes, volume(k) + volume(k-1), keep the total on the surviving row, and remove the row with the duplicate price.
I have the following repeat loop which I am applying to a large dataset looking to delete repeated values.
k <- 1
repeat{
    if( Prices$Price[ k + 1 ] == Prices$Price[ k ] ){
        Prices$CumVolume[ k + 1 ] <- Prices$CumVolume[ k + 1 ] + Prices$CumVolume[ k ]
        Prices <- Prices[ -k , ]
        k <- k + 1
        if( k > nrow( Prices ) ) break
    }
}
The loop is very slow and I was wondering if there are ways to speed it up. Unfortunately I am relatively new to R and am having difficulty working out the best way to go about this.
Also, is there a way in R to observe which iteration the loop is currently up to, i.e. have it displayed on each iteration?
Example data:
Date Time Price CumVolume Ret MeanRet VolRet
26 01-JAN-2009 21:30:01.783 96.660 537 0 0 0
31 01-JAN-2009 21:30:58.041 96.650 78 0 0 0
33 01-JAN-2009 21:34:09.589 96.640 60 0 0 0
35 01-JAN-2009 21:34:10.879 96.640 40 0 0 0
37 01-JAN-2009 21:35:55.001 96.635 50 0 0 0
It appears you want something like this:
DF <- read.table(text=" Date Time Price CumVolume Ret MeanRet VolRet
26 01-JAN-2009 21:30:01.783 96.660 537 0 0 0
31 01-JAN-2009 21:30:58.041 96.650 78 0 0 0
33 01-JAN-2009 21:34:09.589 96.640 60 0 0 0
35 01-JAN-2009 21:34:10.879 96.640 40 0 0 0
37 01-JAN-2009 21:35:55.001 96.635 50 0 0 0", header=TRUE)
#create a run id
DF$runs <- cumsum(c(TRUE, diff(DF$Price) != 0))
#sum per each price run
DF$CCVolume <- with(DF, ave(CumVolume, runs, FUN=sum))
#remove duplicated prices
DF[!duplicated(DF$Price), ]
# Date Time Price CumVolume Ret MeanRet VolRet runs CCVolume
#26 01-JAN-2009 21:30:01.783 96.660 537 0 0 0 1 537
#31 01-JAN-2009 21:30:58.041 96.650 78 0 0 0 2 78
#33 01-JAN-2009 21:34:09.589 96.640 60 0 0 0 3 100
#37 01-JAN-2009 21:35:55.001 96.635 50 0 0 0 4 50
I think your code goes into an infinite loop because of your increment index: the k = k + 1 and the break are always inside the if condition. I hope this is what you want:
z <- unique(Prices$Price)
for (i in 1:length(z)) {
    # rows of the current Prices that share this price
    dupindex <- which(z[i] == Prices$Price)
    # accumulate the volume on the last of those rows
    Prices$CumVolume[tail(dupindex, n = 1)] <- sum(Prices$CumVolume[dupindex])
    # drop all but that last row; the guard is needed because the original
    # dupindex[1:length(dupindex)-1] emptied the data frame for prices
    # that occur only once
    if (length(dupindex) > 1)
        Prices <- Prices[-dupindex[-length(dupindex)], ]
}
I hope it helps, thanks.
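As for the side question about watching the iteration count, a minimal sketch in base R (assuming Prices is the data frame from the question); printing on every pass slows a loop considerably, so report only every so often:
for (k in seq_len(nrow(Prices))) {
    if (k %% 1000 == 0) message("iteration: ", k)  # progress every 1000 rows
    # ... loop body ...
}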

mistake in multivePenal but not in frailtyPenal

The libraries used are survival, splines, boot, and frailtypack; the function used, multivePenal, is in the frailtypack library.
In my data I have two recurrent events (delta.stable and delta.unstable) and one terminal event (delta.censor). There are some time-varying explanatory variables, like the unemployment rate (u.rate), which is quarterly; that is why my dataset has been split by quarters.
Here is a link to the subsample used in the code below, in case it may be helpful to see the mistake: https://www.dropbox.com/s/spfywobydr94bml/cr_05_males_services.rda
The problem is that it runs for a long time before the warning message appears.
The main variables of the Surv function are:
I have two recurrent events:
delta.unstable (unst.): takes the value one when the individual finds an unstable job.
delta.stable (stable): takes the value one when the individual finds a stable job.
And one terminal event:
delta.censor (d.censor): takes the value one when the individual has died, retired, or emigrated.
row id contadorbis unst. stable d.censor .t0 .t
1 78 1 0 1 0 0 88
2 101 2 0 1 0 0 46
3 155 3 0 1 0 0 27
4 170 4 0 0 0 0 61
5 170 4 1 0 0 61 86
6 213 5 0 0 0 0 92
7 213 5 0 0 0 92 182
8 213 5 0 0 0 182 273
9 213 5 0 0 0 273 365
10 213 5 1 0 0 365 394
11 334 6 0 1 0 0 6
12 334 7 1 0 0 0 38
13 369 8 0 0 0 0 27
14 369 8 0 0 0 27 119
15 369 8 0 0 0 119 209
16 369 8 0 0 0 209 300
17 369 8 0 0 0 300 392
When I apply multivePenal I obtain the following message:
Error in aggregate.data.frame(as.data.frame(x), ...) :
arguments must have same length
In addition: Warning messages:
In Surv(.t0, .t, delta.stable) : Stop time must be > start time, NA created
#### multivePenal function
fit.joint.05_malesP <- multivePenal(Surv(.t0, .t, delta.stable) ~ cluster(contadorbis) +
    terminal(as.factor(delta.censor)) + event2(delta.unstable),
    formula.terminalEvent = ~1, formula2 = ~as.factor(h.skill),
    data = cr_05_males_serv, Frailty = TRUE, recurrentAG = TRUE,
    cross.validation = F, n.knots = c(7, 7, 7), kappa = c(1, 1, 1),
    maxit = 1000, hazard = "Splines")
I have checked whether Surv(.t0, .t, delta.stable) contains NAs, and there are none.
In addition, when I apply the function frailtyPenal to the same data for both possible combinations, it runs well and I get results. I have spent a week looking at this and cannot find the key. I would appreciate some light on this problem.
#delta unstable+death
fit.joint.05_males<-frailtyPenal(Surv(.t0,.t,delta.unstable)~cluster(id)+u.rate+as.factor(h.skill)+as.factor(m.skill)+as.factor(non.manual)+as.factor(municipio)+as.factor(spanish.speakers)+ as.factor(no.spanish.speaker)+as.factor(Aged.16.19)+as.factor(Aged.20.24)+as.factor(Aged.25.29)+as.factor(Aged.30.34)+as.factor(Aged.35.39)+ as.factor(Aged.40.44)+as.factor(Aged.45.51)+as.factor(older61)+ as.factor(responsabilities)+
terminal(delta.censor),formula.terminalEvent=~u.rate+as.factor(h.skill)+as.factor(m.skill)+as.factor(municipio)+as.factor(spanish.speakers)+as.factor(no.spanish.speaker)+as.factor(Aged.16.19)+as.factor(Aged.20.24)+as.factor(Aged.25.29)+as.factor(Aged.30.34)+as.factor(Aged.35.39)+as.factor(Aged.40.44)+as.factor(Aged.45.51)+as.factor(older61)+ as.factor(responsabilities),data=cr_05_males_services,n.knots=12,kappa1=1000,kappa2=1000,maxit=1000, Frailty=TRUE,joint=TRUE, recurrentAG=TRUE)
###Be patient. The program is computing ...
###The program took 2259.42 seconds
#delta stable+death
fit.joint.05_males <- frailtyPenal(Surv(.t0,.t,delta.stable)~cluster(id)+u.rate+as.factor(h.skill)+as.factor(m.skill)+as.factor(non.manual)+as.factor(municipio)+as.factor(spanish.speakers)+as.factor(no.spanish.speaker)+as.factor(Aged.16.19)+as.factor(Aged.20.24)+as.factor(Aged.25.29)+as.factor(Aged.30.34)+as.factor(Aged.35.39)+as.factor(Aged.40.44)+as.factor(Aged.45.51)+as.factor(older61)+as.factor(responsabilities)+terminal(delta.censor),formula.terminalEvent=~u.rate+as.factor(h.skill)+as.factor(m.skill)+as.factor(municipio)+as.factor(spanish.speakers)+as.factor(no.spanish.speaker)+as.factor(Aged.16.19)+as.factor(Aged.20.24)+as.factor(Aged.25.29)+as.factor(Aged.30.34)+as.factor(Aged.35.39)+as.factor(Aged.40.44)+as.factor(Aged.45.51)+as.factor(older61)+as.factor(responsabilities),data=cr_05_males_services,n.knots=12,kappa1=1000,kappa2=1000,maxit=1000, Frailty=TRUE,joint=TRUE, recurrentAG=TRUE)
###The program took 3167.15 seconds
Because you provide neither information about the packages used nor the data necessary to run multivePenal or frailtyPenal, I can only help you with the Surv part (because I happened to have that package loaded).
The Surv warning message you provided (In Surv(.t0, .t, delta.stable) : Stop time must be > start time, NA created) suggests that something is strange with your variables .t0 (the time argument in Surv, referred to as 'start time' in the warning) and/or .t (the time2 argument, 'Stop time' in the warning). I check this possibility with a simple example.
# read the data you feed `Surv` with
df <- read.table(text = "row id contadorbis unst. stable d.censor .t0 .t
1 78 1 0 1 0 0 88
2 101 2 0 1 0 0 46
3 155 3 0 1 0 0 27
4 170 4 0 0 0 0 61
5 170 4 1 0 0 61 86
6 213 5 0 0 0 0 92
7 213 5 0 0 0 92 182
8 213 5 0 0 0 182 273
9 213 5 0 0 0 273 365
10 213 5 1 0 0 365 394
11 334 6 0 1 0 0 6
12 334 7 1 0 0 0 38
13 369 8 0 0 0 0 27
14 369 8 0 0 0 27 119
15 369 8 0 0 0 119 209
16 369 8 0 0 0 209 300
17 369 8 0 0 0 300 392", header = TRUE)
# create survival object
mysurv <- with(df, Surv(time = .t0, time2 = .t, event = stable))
mysurv
# create a new data set where one .t for some reason is less than .t0
# on row five .t0 is 61, so I set .t to 60
df2 <- df
df2$.t[df2$.t == 86] <- 60
# create survival object using new data which contains at least one Stop time that is less than Start time
mysurv2 <- with(df2, Surv(time = .t0, time2 = .t, event = stable))
# Warning message:
# In Surv(time = .t0, time2 = .t, event = stable) :
# Stop time must be > start time, NA created
# i.e. the same warning message as you got
# check the survival object
mysurv2
# as you can see, the fifth interval contains NA
# I would recommend you check .t0 and .t in your data set carefully
# one way to examine rows where Stop time (.t) is less than start time (.t0) is:
df2[which(df2$.t0 > df2$.t), ]
I am not familiar with multivePenal, but it seems that it does not accept a survival object which contains intervals with NA, whereas frailtyPenal might.
The authors of the package have told me that the function is not finished yet, so perhaps that is the reason it is not working well.
I encountered the same error and arrived at this solution.
frailtyPenal() will not accept data.frames of different lengths. The data.frame used in Surv and the data.frame named in data= in frailtyPenal must be the same length. I used a Cox regression to identify the incomplete cases, reset the survival object to exclude the missing cases and, finally, ran frailtyPenal:
library(survival)
library(frailtypack)
data(readmission)
#Reproduce the error
#change the first start time to NA
readmission[1,3] <- NA
#create a survival object with one missing time
surv.obj1 <- with(readmission, Surv(t.start, t.stop, event))
#observe the error
frailtyPenal(surv.obj1 ~ cluster(id) + dukes,
data=readmission,
cross.validation=FALSE,
n.knots=10,
kappa=1,
hazard="Splines")
#repair by resetting the surv object to omit the missing value(s)
#identify NAs using a Cox model
cox.na <- coxph(surv.obj1 ~ dukes, data = readmission)
#remove the NA cases from the original set to create complete cases
readmission2 <- readmission[-cox.na$na.action,]
#reset the survival object using the complete cases
surv.obj2 <- with(readmission2, Surv(t.start, t.stop, event))
#run frailtyPenal using the complete cases dataset and the complete cases Surv object
frailtyPenal(surv.obj2 ~ cluster(id) + dukes,
data = readmission2,
cross.validation = FALSE,
n.knots = 10,
kappa = 1,
hazard = "Splines")

UNIX Full Outer Join creates duplicate entries despite correct order? Could it be the unpaired items creating disorder?

I have two files that I want to join based on their first column.
They are sorted, and not all of the values in the first column of FILE1 are in FILE2, and vice versa.
FILE1.TXT looks something like this, except it is around 15k lines:
snRNA:7SK 1037
snRNA:U11 144
snRNA:U1:21D 348.293
snRNA:U12:73B 16
snRNA:U1:82Eb 2.14286
snRNA:U1:95Ca 348.293
snRNA:U1:95Cb 351.96
snRNA:U1:95Cc 35.5095
snRNA:U2:14B 447.35
snRNA:U2:34ABa 459.75
snRNA:U2:34ABb 513.25
snRNA:U2:34ABc 509
snRNA:U2:38ABa 443.65
snRNA:U4:38AB 155
snRNA:U4:39B 611.833
snRNA:U4atac:82E 152.5
snRNA:U5:14B 1
snRNA:U5:23D 2.5
snRNA:U5:34A 11
snRNA:U5:38ABb 2.5
snRNA:U5:63BC 44
snRNA:U6:96Aa 18
snRNA:U6:96Ab 9.5
snRNA:U6:96Ac 8.5
snRNA:U7 4
snRNA:U8 8
FILE2.TXT looks like this, it is also ~15K lines:
snRNA:7SK 1259
snRNA:U11 33
snRNA:U1:21D 1480.57
snRNA:U12:73B 4
snRNA:U1:82Eb 10.2
snRNA:U1:95Ca 1480.57
snRNA:U1:95Cb 1484.03
snRNA:U1:95Cc 114.633
snRNA:U2:14B 4678.89
snRNA:U2:34ABa 4789.93
snRNA:U2:34ABb 5292.22
snRNA:U2:34ABc 5273.23
snRNA:U2:38ABa 4557.88
snRNA:U2:38ABb 3.75
snRNA:U4:38AB 405
snRNA:U4:39B 1503.5
snRNA:U4atac:82E 548
snRNA:U5:14B 25
snRNA:U5:23D 19
snRNA:U5:34A 32
snRNA:U5:38ABb 4
snRNA:U5:63BC 742
snRNA:U6:96Aa 39.5
snRNA:U6:96Ab 1
snRNA:U6:96Ac 1
snRNA:U7 11
As you can see, an element from FILE2 (snRNA:U2:38ABb) is missing in FILE1, and an element from FILE1 (snRNA:U8) is missing in FILE2. This is the case all throughout the files, in both directions and multiple times.
I am writing the command as follows:
join -a1 -a2 -e "0" -1 1 -2 1 -o '0,1.2,2.2' -t ' ' FILE1.TXT FILE2.TXT > JOIN_FILE.TXT
If I try the command with ONLY the 20 or so lines that I pasted from each file, it works as it should.
But when I run it on the entire files, the output is terrible, and I don't understand why. Both files were sorted using sort -k1,1, so even though some lines in FILE1 are not in FILE2, and vice versa, they are both in the same order.
What I get is duplicate entries for an item, such as the following (again, I'm only showing a fraction of the output file):
snRNA:7SK 0 1037
snRNA:U11 0 144
snRNA:U1:21D 0 348.293
snRNA:U12:73B 0 16
snRNA:U1:82Eb 0 2.14286
snRNA:U1:95Ca 0 348.293
snRNA:U1:95Cb 0 351.96
snRNA:U1:95Cc 0 35.5095
snRNA:U2:14B 0 447.35
snRNA:U2:34ABa 0 459.75
snRNA:U2:34ABb 0 513.25
snRNA:U2:34ABc 0 509
snRNA:U2:38ABa 0 443.65
snRNA:U4:38AB 0 155
snRNA:U4:39B 0 611.833
snRNA:U4atac:82E 0 152.5
snRNA:U5:14B 0 1
snRNA:U5:23D 0 2.5
snRNA:U5:34A 0 11
snRNA:U5:38ABb 0 2.5
snRNA:U5:63BC 0 44
snRNA:U6:96Aa 0 18
snRNA:U6:96Ab 0 9.5
snRNA:U6:96Ac 0 8.5
snRNA:U7 0 4
snRNA:7SK 1259 0
snRNA:U11 33 0
snRNA:U1:21D 1480.57 0
snRNA:U12:73B 4 0
snRNA:U1:82Eb 10.2 0
snRNA:U1:95Ca 1480.57 0
snRNA:U1:95Cb 1484.03 0
snRNA:U1:95Cc 114.633 0
snRNA:U2:14B 4678.89 0
snRNA:U2:34ABa 4789.93 0
snRNA:U2:34ABb 5292.22 0
snRNA:U2:34ABc 5273.23 0
snRNA:U2:38ABa 4557.88 0
snRNA:U2:38ABb 3.75 0
snRNA:U4:38AB 405 0
snRNA:U4:39B 1503.5 0
snRNA:U4atac:82E 548 0
snRNA:U5:14B 25 0
snRNA:U5:23D 19 0
snRNA:U5:34A 32 0
snRNA:U5:38ABb 4 0
snRNA:U5:63BC 742 0
snRNA:U6:96Aa 39.5 0
snRNA:U6:96Ab 1 0
snRNA:U6:96Ac 1 0
snRNA:U7 11 0
where essentially everything has been duplicated, with one line for the value in FILE1 and another line for the value in FILE2. Could this be because of the accumulated differences between the files (i.e., all the non-paired entries before these specific ones)?
This scrambling runs all throughout the output file.
What am I doing wrong? Am I not specifying that entries in both files don't always match?
Is there any way to solve this?
Thanks a lot!
Carmen
Edit:
Here are the first 15 lines of each file, to show that the order is the same in both, but that they start to diverge because items appear in FILE1 that are not in FILE2, and vice versa. I wonder if this is what causes the mix-up.
==> FILE1 <==
128up 139
140up 170
14-3-3epsilon 4488
14-3-3zeta 24900
18w 885
26-29-p 517
2mit 3085.34
312 64
4EHP 9012.57
5.8SrRNA:CR40454 16.5
5-HT1A 1867
5-HT1B 366
5-HT2 2611.27
5-HT7 1641.67
5PtaseI 462
==> FILE2 <==
128up 80
140up 19
14-3-3epsilon 1718
14-3-3zeta 5554
18w 213
26-29-p 200
2mit 680.786
312 33
4EHP 1838.44
5-HT1A 303
5-HT1B 42
5-HT2 553.65
5-HT7 348.5
5PtaseI 105
5S_DM 46054.4
It is possible that you have spaces instead of tabs in one of your files.
Your join command seems to give duplicated entries when there is a space in one of the lines:
#> bash fjoin.sh
:: join ::
join: s.file1s.txt:2: is not sorted: 128up 139
:: diff ::
1c1,3
< 128up 139 80
---
> 128up 0 80
> 128up 139 0 0
> 128up 139 0
#> grep " " file*txt
file1s.txt:128up 139
#> grep 128up file1s.txt
128up 139
128up 139
fjoin.sh
#!/bin/bash
f1="file1.txt"
f1s="file1s.txt"
f2="file2.txt"
# sort files & remove duplicate
sort -k 1b,1 ${f1} | uniq > s.${f1}
sort -k 1b,1 ${f1s} | uniq > s.${f1s}
sort -k 1b,1 ${f2} | uniq > s.${f2}
echo ":: join ::"
join -a1 -a2 -e "0" -1 1 -2 1 -o '0,1.2,2.2' -t ' ' s.${f1} s.${f2} > joined-1_f1_f2.txt
join -a1 -a2 -e "0" -1 1 -2 1 -o '0,1.2,2.2' -t ' ' s.${f1s} s.${f2} > joined-2_f1_f2.txt
echo " "
echo ":: diff ::"
diff joined-1_f1_f2.txt joined-2_f1_f2.txt
Update:
Setting LC_ALL=C, as Pierre suggested, could help.
There are fewer differences after adding export LC_ALL=C to fjoin.sh:
#> bash fjoin.sh
:: join ::
:: diff ::
1a2
> 128up 139 0 0
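Putting the two fixes together (byte-wise collation plus a real tab as the separator), a hedged sketch of the whole pipeline; this assumes the files are tab-separated, and the $'\t' syntax requires bash:
#!/bin/bash
# Byte-wise collation so sort and join agree on key ordering
export LC_ALL=C
# Optional: check that the inputs are already in join's expected order
sort -c -k1,1 FILE1.TXT && sort -c -k1,1 FILE2.TXT
# Re-sort defensively, then full outer join on field 1, filling missing sides with 0
sort -k1,1 FILE1.TXT > s1.txt
sort -k1,1 FILE2.TXT > s2.txt
join -a1 -a2 -e "0" -o '0,1.2,2.2' -t $'\t' s1.txt s2.txt > JOIN_FILE.TXT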
