Is it ever possible to reduce pg_num for a specific pool?

Unfortunately, I found that the ceph CLI does not allow decreasing the value of pg_num for a specific pool.
ceph osd pool set .rgw.root pg_num 32
It shows this error:
Error EEXIST: specified pg_num 32 <= current 128
The placement-groups tutorial explains what pg_num is and how to choose the best value for it. But there is hardly any tutorial on how to reduce pg_num without reinstalling Ceph or deleting the pool first, as in ceph-reduce-the-pg-number-on-a-pool.
The existing SO thread ceph-too-many-pgs-per-osd shows how to decide on the best value. But if I have already run into the issue, how can I recover from the mess?
If reducing pg_num is not easy, what is the story behind it? Why doesn't Ceph expose an interface to reduce it?

The Nautilus release of Ceph allows pg_num to be both increased and decreased, and it adds PG autoscaling (pg_autoscale).
If you want to increase or reduce the pg_num/pgp_num values without having to create, copy and rename pools (as suggested in your link), the best option is to upgrade to Nautilus.
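As a rough sketch of what this looks like on Nautilus or later, the snippet below builds the ceph commands for shrinking a pool and, optionally, for handing PG sizing over to the autoscaler. The helper names `pg_num_commands` and `run_commands` and the dry-run default are illustrative assumptions, not part of Ceph; the pool name and target value are the ones from the question.

```python
import shlex
import subprocess


def pg_num_commands(pool, pg_num, autoscale=False):
    """Return the ceph CLI invocations (as argv lists) for resizing a pool.

    Assumes a Nautilus-or-later cluster, where pg_num may be decreased.
    """
    if pg_num < 1:
        raise ValueError("pg_num must be a positive integer")
    commands = [["ceph", "osd", "pool", "set", pool, "pg_num", str(pg_num)]]
    if autoscale:
        # Let the pg_autoscaler manager module adjust pg_num from now on.
        commands.append(["ceph", "mgr", "module", "enable", "pg_autoscaler"])
        commands.append(
            ["ceph", "osd", "pool", "set", pool, "pg_autoscale_mode", "on"]
        )
    return commands


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run: shows the commands without touching a cluster.
    run_commands(pg_num_commands(".rgw.root", 32))
```

Run as-is, it only prints the commands. Pass `dry_run=False` on a host with a working ceph client to actually apply them.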

Azure Machine Learning throws error "Invalid graph: You have invalid compute target(s) in node(s)" while running the pipeline

I am facing a strange issue with the Azure Machine Learning (Preview) interface.
I designed a training pipeline that ran on a particular compute target (a 2-node cluster with a minimal configuration). However, it took a lot of time to execute, so I tried to create a new training cluster (an 8-node cluster with a higher configuration). During this process, I created and deleted several training clusters.
Strangely, since then, I get the error "Invalid graph: You have invalid compute target(s) in node(s)" whenever I submit the pipeline.
Could you please advise on this situation?
Thanks,
Mitul
I bet this was pretty frustrating. A common debugging strategy I have is to delete compute targets and create new ones. Perhaps this was another "transient" error?
The issue should now be fixed, and the fix will be rolled out soon. Meanwhile, as a temporary workaround, you can refresh the page to make it work.

Why is Symfony3 so slow?

I installed the Symfony3 framework-standard-edition. When I open the home page (app.php, prod), it takes 300-400 ms to load.
This is my profiler information:
I also use PHP 7.
Why does it take so long?
You can try to optimize Zend OPCache.
Here are some recommended settings
opcache.revalidate_freq
In short: how often, in seconds, PHP should check whether your code has changed and the cached copy has expired. 0 means it checks your PHP code on every single request (which adds lots of stat syscalls). Set it to 0 in your development environment. In production it doesn't matter because of the next setting.
opcache.validate_timestamps
When this is enabled, PHP will check the file timestamp per your opcache.revalidate_freq value.
When it's disabled, opcache.revalidate_freq is ignored and PHP files are NEVER checked for updated code. So, if you modify your code, the changes won't actually run until you restart or reload PHP (you can force a reload with kill -SIGUSR2).
Yes, this is a pain in the ass, but you should use it. Why? While you're updating or deploying code, new code files can get mixed with old ones, and the results are unknown. It's unsafe as hell.
opcache.max_accelerated_files
Controls how many PHP files, at most, can be held in memory at once. It's important that your project has FEWER FILES than whatever you set this to. For a codebase of ~6000 files, I set max_accelerated_files to 8000. (8000 is not itself a prime number; PHP rounds the setting up to the next prime in its internal table.)
You can run find . -type f -print | grep php | wc -l to quickly calculate the number of files in your codebase.
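A slightly stricter count than the grep above (which also matches any path that merely contains "php") can be sketched in Python. The function names are made up for illustration, and the prime table is the one I believe the PHP manual lists for opcache.max_accelerated_files; treat it as an assumption and check it against your PHP version.

```python
import os

# Primes that PHP is documented to round opcache.max_accelerated_files up to
# (assumption: taken from the PHP manual; verify for your PHP version).
OPCACHE_PRIMES = (223, 463, 983, 1979, 3907, 7963, 16229, 32531,
                  65407, 130987, 262237, 524521, 1048793)


def count_php_files(root="."):
    """Count regular files under root whose name ends in .php."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.endswith(".php"))
    return total


def effective_slots(configured):
    """Return the first table prime >= configured (capped at the largest)."""
    for prime in OPCACHE_PRIMES:
        if prime >= configured:
            return prime
    return OPCACHE_PRIMES[-1]


if __name__ == "__main__":
    files = count_php_files(".")
    print(f"{files} PHP files found")
    print(f"a setting of 8000 gives {effective_slots(8000)} slots")
```

If the file count comes out close to or above the effective slot count, raise the setting to the next tier.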
opcache.memory_consumption
The default is 64MB. You can use the function opcache_get_status() to tell how much memory opcache is consuming and whether you need to increase the amount.
opcache.interned_strings_buffer
A pretty neat setting with like 0 documentation. PHP uses a technique called string interning to improve performance. For example, if you have the string "foobar" 1000 times in your code, internally PHP will store 1 immutable variable for this string and just use a pointer to it for the other 999 times you use it. Cool.
This setting takes it to the next level: instead of having a pool of these immutable strings for each SINGLE php-fpm process, this setting shares it across ALL of your php-fpm processes. It saves memory and improves performance, especially in big applications.
The value is set in megabytes, so set it to "16" for 16MB. The default is low, 4MB.
opcache.fast_shutdown
Another interesting setting with no useful documentation. "Allows for faster shutdown".
Oh okay. Like that helps me. What this actually does is provide a faster mechanism for calling the destructors in your code at the end of a single request to speed up the response and recycle php workers so they're ready for the next incoming request faster.
Set it to 1 and turn it on.
opcache.enable=1
opcache.memory_consumption=256
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=8000
opcache.validate_timestamps=0
opcache.revalidate_freq=0
opcache.fast_shutdown=1
I hope this helps improve your performance.
[EDIT]
You might also want to look at this answer:
Are Doctrine relations affecting application performance?
TheMrbikus, try some optimizations along these lines:
Use APC
Use bootstrap files
Reference: http://symfony.com/doc/current/performance.html
Use OPcache with PHP 7
Use Apache with PHP-FPM.
E-mail sending and form rendering can slow a request down. Create a blank test controller to compare against.

How to configure Asterisk realtime with mysql properly?

I currently have over 1k realtime users set up on a MySQL server for Asterisk (only 10-20 users will register simultaneously). The problem is that SIP registration does not succeed every time. Sometimes I get 'registration timeout'. Is there a setup guide, or a setting I need to configure, in order to get >99% successful registrations?
I have never faced this issue, as I have fewer users.
But according to the Asterisk documentation:
If you have problems with your network connection going up and down (e.g. an unreliable cable connection) and you keep losing your sip registry, you may want to add registerattempts and registertimeout settings to the general section above the register definitions.
Setting registerattempts=0 will force Asterisk to attempt to reregister until it can (the default is 10 tries).
registertimeout sets the length of time in seconds between registration attempts (the default is 20 seconds).
About achieving 99% success:
I think you have to study your system and tune the settings above accordingly (dynamically). I suggest using a Markovian model, such as an M/M/1 queue simulation, if your system is not complicated.

Indefinite Provisioning of EMR Cluster with Segue in R

I am trying to use the R package called Segue by JD Long, which is lauded as the ultimate in simplicity for using R with AWS by a book I read called "Parallel R".
However, for the 2nd day in a row I've run into a problem where I initiate the creation of a cluster and it just says STARTING indefinitely.
I tried this on OS X and in Linux with clusters of sizes 2, 6, 10, 20, and 25. I let them all run for at least 6 hours. I have no problem starting a cluster in the AWS EMR Management Console, though I have no clue how to connect Segue/R to a cluster that was started in the Management Console instead of via createCluster().
So my question is: is there some way either to troubleshoot the provisioning of the cluster, or to bypass the problem by creating the cluster manually and somehow getting Segue to work with that?
Here's an example of what I'm seeing:
library(segue)
Loading required package: rJava
Loading required package: caTools
Segue did not find your AWS credentials. Please run the setCredentials() function.
setCredentials("xxx", "xxx")
emr.handle <- createCluster(numInstances=10)
STARTING - 2013-07-12 10:36:44
STARTING - 2013-07-12 10:37:15
STARTING - 2013-07-12 10:37:46
STARTING - 2013-07-12 10:38:17
.... this goes on for hours and hours and hours...
UPDATE: After 36 hours and many failed attempts, this began working (randomly...) when I tried it with 1 node. I then tried it with 10 nodes and it worked great. To my knowledge nothing changed locally or on AWS...
I am answering my own question on behalf of the AWS support rep who gave me the following belated explanation:
The problem with the EMR creation is the Availability Zone specified (us-east-1c). This Availability Zone is now constrained and doesn't allow the creation of new instances, so the job was trying to create the instances in an infinite loop.
You can see information about constrained AZ here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-regions-availability-zones
"As Availability Zones grow over time, our ability to expand them can become constrained. If this happens, we might restrict you from launching an instance in a constrained Availability Zone unless you already have an instance in that Availability Zone. Eventually, we might also remove the constrained Availability Zone from the list of Availability Zones for new customers. Therefore, your account might have a different number of available Availability Zones in a region than another account."
So you need to specify another AZ or, as I recommend, not specify any AZ, so that EMR can select any available one.
I found this thread on Google Groups, where the topic of availability zones came up before: https://groups.google.com/forum/#!topic/segue-r/GBd15jsFXkY
The zone that was set as the new default in that thread was the zone causing problems for me. I am attempting to edit the source of Segue.
Jason, I'm the author of Segue so maybe I can help.
Please look under the details section in the lower part of the AWS console and see if you can determine if the bootstrap sequences completed. This is an odd problem because typically an error at this stage is pervasive across all users. However I can't reproduce this one.

Have R halt the EC2 machine it's running on

I have a few workflows where I would like R to halt the Linux machine it's running on after a script completes. I can think of two similar ways to do this:
run R as root and then call system("halt")
run R from a root shell script (could run the R script as any user) then have the shell script run halt after the R bit completes.
Are there other easy ways of doing this?
The use case here is scripts running on AWS, where I would like the instance to stop after the script completes so that I don't get charged for machine time after the job has run. The instance I use for data analysis is EBS backed, so I don't want to terminate it, simply suspend it. Issuing a halt command from inside the instance has the same effect as a stop/suspend from the AWS console.
I'm impressed that works. (For anyone else surprised that an instance can stop itself, see notes 1 & 2.)
You can also try "sudo halt", as you wouldn't need to run as a root user, as long as the user account running R is capable of running sudo. This is pretty common on a lot of AMIs on EC2.
Be careful about assuming that R will exit cleanly: believe it or not, one can crash R. It may be better to have a separate script that watches the R PID and, once that PID is no longer active, terminates the instance. If the halt command is issued from inside R and R crashes, it never reaches the call to halt. If you call it from within another script, that can be dangerous, too. If you know Linux well, what you're looking for is the PID from starting R, which you can pass to another script that checks ps, say every second, and then terminates the instance once the PID is no longer running.
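A minimal sketch of such a watcher, written in Python rather than as a shell loop over ps. All function names here are hypothetical. It assumes a POSIX system, that the watcher runs as its own process (not as the parent of R, since an exited but unreaped child still counts as existing), and that the account is allowed to run sudo halt without a password prompt.

```python
import os
import subprocess
import sys
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def halt_instance():
    """Halt the machine; assumes passwordless `sudo halt` is permitted."""
    subprocess.run(["sudo", "halt"], check=True)


def watch(pid, on_exit=halt_instance, poll_seconds=1.0, is_alive=pid_alive):
    """Poll until the PID disappears, then call on_exit once."""
    while is_alive(pid):
        time.sleep(poll_seconds)
    on_exit()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python watch_r.py <pid-of-the-R-process>
    watch(int(sys.argv[1]))
```

Because the watcher only looks at the PID, it fires whether R finished normally or crashed. The `on_exit` hook could just as well send a message to a master server instead of halting locally.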
I think a better solution is to use the EC2 API tools (see: http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/ for documentation) to terminate OR stop instances. There's a difference between the two of these, and it matters if your instance is EBS backed or S3 backed. You needn't run as root in order to terminate the instance - the fact that you have the private key and certificate shows Amazon that you're the BOSS, way above the hoi polloi who merely have root access on your instance.
Because these credentials can be used for mischief, be careful about running the API tools from a given server: you'll need your certificate and private key on that server, which is a bad idea in the event that you have a security problem. It would be better to send a message to a master server and have it shut down the instance. If you have messaging set up in any way between instances, this can do all the work for you.
Note 1: Eric Hammond reports that the halt will only suspend an EBS instance, so you still have storage fees. If you happen to start a lot of such instances, this can clutter things up. Your original question seems unclear about whether you mean to terminate or stop an instance. He has other good advice on this page.
Note 2: A short thread on the EC2 developers forum gives advice for Linux & Windows users.
Note 3: EBS instances are billed for partial hours, even when restarted. (See this thread from the developer forum.) Having an auto-suspend close to the hour mark can be useful, assuming the R process isn't working, in case one might re-task that instance (i.e. to save on not restarting). Other useful tools to consider: setTimeLimit and setSessionTimeLimit, and various checkpointing tools (I have a Q that mentions a couple). Using an auto-kill is useful if one has potentially badly behaved code.
Note 4: I recently learned of the shutdown command in package fun. This is multi-platform. See this blog post for commentary, and code is here. Dangerous stuff, but it could be useful if you want to adapt to Windows. I haven't tried it, though.
Update 1. Three more ideas:
You could use .Last() and runLast = TRUE for q() and quit(), which could shut down the instance.
If using littler or a script that invokes the script via Rscript, the same command line functions could be used.
My favorite package of today, tcltk2, has a neat timer mechanism called tclTaskSchedule() that can be used to schedule the execution of an expression. You could then go crazy with the execution of stuff just before an hourly interval has elapsed.
system("echo 'rootpassword' | sudo -S halt")
However, the downside is having your root password in plain text in the script.
AFAIK those ways you mentioned are the only ones. In any case the script will have to run as root to be able to shut down the machine (if you find a way to do it without root that's possibly an exploit). You ask for an easier way but system("halt") is just an additional line at the end of your script.
sudo is an option: it can be configured to let you run certain commands without being prompted for a password. Just put something like this in /etc/sudoers:
<username> ALL=(ALL) PASSWD: ALL, NOPASSWD: /sbin/halt
(of course replacing <username> with the name of the user running R) and system('sudo halt') should just work.
