I have been itching for a long time to know the historical reason for calling daemon programs or threads "daemons".
Lat. daemon, Latin version of the Greek "δαίμων" ("godlike power, fate, god"): a god; a subordinate deity, as the genius of a place or a person's attendant spirit.
There are numerous questions clarifying what daemons are and how they behave, but none explaining the origins of the term "daemon" for programs that run in the background like sshd.
Why do we call programs that run in the background "daemons"?
See the wiki
According to Fernando J. Corbató, who worked on Project MAC in 1963, his
team was the first to use the term daemon. The use of the term was
inspired by Maxwell's demon, the imaginary agent from physics and
thermodynamics that helped to sort molecules:
"We fancifully began to use the word daemon to describe background
processes which worked tirelessly to perform system chores."
In the Unix System Administration Handbook, Evi Nemeth states the
following about daemons:
"Many people equate the word "daemon" with the word "demon", implying
some kind of satanic connection between UNIX and the underworld. This
is an egregious misunderstanding. "Daemon" is actually a much older
form of "demon"; daemons have no particular bias towards good or evil,
but rather serve to help define a person's character or personality.
The ancient Greeks' concept of a "personal daemon" was similar to the
modern concept of a "guardian angel"—eudaemonia is the state of being
helped or protected by a kindly spirit. As a rule, UNIX systems seem
to be infested with both daemons and demons."
According to Wikipedia:
The term was coined by the programmers of MIT's Project MAC. They took the name from Maxwell's demon, an imaginary being from a thought experiment that constantly works in the background, sorting molecules.
Unix systems inherited this terminology. Maxwell's Demon is consistent with Greek mythology's interpretation of a daemon as a supernatural being working in the background, with no particular bias towards good or evil. However, BSD and some of its derivatives have adopted a Christian demon as their mascot rather than a Greek daemon.
I have studied many subjects, from physics to logic gates to processors.
I have also studied computer architecture, compilers, x86 assembly, operating systems, GPUs, and so on.
For some reason, none of the subjects above covered what happens after the compiler produces an executable file, all the way down to the processor.
Could you please point me to resources that explain these things? It drives me crazy when I don't understand why things work the way they do.
For example, I want to understand why UNIX executables start with 'ELF'. If you tell me it's a convention, then how does the computer, as a machine, understand that the file being passed to it starts with 'ELF'?
You may say that is the operating system's job, but then how does the computer understand the code of the operating system? I know the processor reads it represented in binary.
But how does the computer really understand the binary? Don't just say "via transistors and logic gates". What I need to understand is how the binary is signaled to the computer hardware, and how the hardware is actually implemented so that it understands the binary.
Please share any resources that explain this stage I mentioned above.
Binary is really just electrical signals and small charges stored in memory circuits. The binary data (e.g. 0101) is just a number represented with 2 symbols instead of 10 (as in base 10, or decimal). What matters is the way you interpret those numbers to give them meaning.
For example, in processor architectures like x64, the developers of the architecture create an instruction set that has a certain number of instructions with a very specific format. The binary data is then interpreted as those instructions. Since the CPU's logic circuits are manufactured to understand those instructions, it can execute them by doing what the architecture says they should do.
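To make the "interpretation by convention" point concrete, here is a toy sketch of a made-up instruction set and a tiny interpreter for it. The opcodes and format are invented purely for illustration (a real ISA like x64 is far richer, and the decoding is done by hardware, not by software):

    # A made-up instruction set: each instruction is one byte.
    # High 4 bits = opcode, low 4 bits = operand.
    #   0x1 = LOAD:  put the 4-bit literal into register 1
    #   0x2 = ADD:   add the named register to the accumulator
    #   0x3 = PRINT: print the accumulator
    #   0x0 = HALT
    program = bytes([0x15, 0x21, 0x13, 0x21, 0x30, 0x00])  # compute 5 + 3, print

    registers = [0] * 16
    acc = 0
    pc = 0  # program counter: index of the byte being executed

    while True:
        instruction = program[pc]
        opcode, operand = instruction >> 4, instruction & 0x0F
        pc += 1
        if opcode == 0x0:        # HALT
            break
        elif opcode == 0x1:      # LOAD literal into register 1
            registers[1] = operand
        elif opcode == 0x2:      # ADD register to accumulator
            acc += registers[operand]
        elif opcode == 0x3:      # PRINT
            print(acc)           # prints 8

The byte 0x21 means "add" here only because the program and the interpreter agree on that convention; in a real CPU the same agreement is wired into the decode logic of the chip.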
The CPU is also configured to start executing at a specific address. The motherboard manufacturer puts machine code in binary at that address to kickstart the CPU. This is called the firmware. The OS developer will put its machine code on the hard-drive at a conventional position and in a conventional format so that the firmware can find and execute it.
Even in electrical terms there needs to be a conventional interpretation of signals. A certain chip on your motherboard might interpret 3 V as a high signal, and another might use 5 V. What matters is that interconnected chips understand each other's interfaces. For example, an x64 processor might be connected to modern DDR4 RAM. The CPU knows that if it applies a signal (say, a square-wave clock) to certain pins of that chip, then each oscillation will write or read data in the memory cells that are selected. It knows that because it expects the RAM module to follow the DDR4 standard. If the RAM module doesn't follow the standard, then the CPU will not be able to interact with it.
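On the 'ELF' part of the question above: that marker is literally the first four bytes of the executable file (0x7F followed by the ASCII letters 'E', 'L', 'F'), and the operating system's loader simply reads those bytes and refuses to run the file if they are not there. A minimal sketch of that check (assuming a Linux machine, where /bin/ls is an ELF binary):

    # Read the first four bytes of an executable and compare them with the
    # ELF magic number that the kernel's loader checks by convention.
    with open("/bin/ls", "rb") as f:
        magic = f.read(4)

    print(magic)                # b'\x7fELF' on a typical Linux system
    print(magic == b"\x7fELF")  # True if this really is an ELF file

There is nothing magical about those bytes themselves; they are a convention that the kernel, the linker, and every other tool agree on, just like the instruction encodings above.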
I know that there are various metrics for measuring the quality of machine translation systems, for example:
BLEU
METEOR
LEPOR
Are there metric results for popular translation systems available anywhere in the public domain? For example, systems such as:
Google translate
Yandex Translate
Microsoft translate
Promt
Apertium
Openlogs
Machine translation quality is evaluated annually at the Conference on Machine Translation. Most of the evaluated systems are experimental systems from universities, but most of the systems you mention participate as well. You can find the results of last year's human evaluation in Table 11 on page 24 of the conference findings.
Most of the systems you mentioned participate anonymously under acronyms of the form online-?, but you can often guess which system is which.
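As a side note, if you want to score a system's output yourself with one of the metrics you listed, BLEU is easy to run locally. Here is a minimal sketch using NLTK's corpus_bleu (assuming the nltk package is installed; the reference and hypothesis sentences are made up for illustration):

    from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

    # One reference translation per sentence (each reference is a token list,
    # and each sentence may have several references, hence the nested lists).
    references = [
        [["the", "cat", "is", "on", "the", "mat"]],
        [["there", "is", "a", "cat", "on", "the", "mat"]],
    ]
    # The system output we want to score, tokenized the same way.
    hypotheses = [
        ["the", "cat", "sat", "on", "the", "mat"],
        ["there", "is", "a", "cat", "on", "the", "mat"],
    ]

    # Smoothing avoids a zero score when some n-gram order has no matches.
    score = corpus_bleu(references, hypotheses,
                        smoothing_function=SmoothingFunction().method1)
    print(f"BLEU: {score:.3f}")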
There are two questions:
1) What is the difference between a Cluster and a Grid?
2) What is the Cloud?
I am not looking for conceptual definitions; I found plenty of those by googling, but the problem is that
I still do not get it, so I believe the answer I seek is different. From what I could research online, I have started to think that
many of the writers trying to explain this either do not understand it deeply enough themselves
or are not able to explain their knowledge to an average guy like me (which is a common issue with very technical people).
Just to let you know my level: I am a computer programmer (.NET and LAMP), I can do basic admin on both
Linux flavors and Windows, I have hands-on experience with Hyper-V, and I am now researching Xen and XCP
to set up a test cloud based on two computers for learning purposes.
You do not have to read the info below; it is just my current understanding of cluster, grid, and cloud,
included to support my two questions, because I thought it would help to show
what kind of mess is in my head right now and what answers I am looking for.
Thank you.
The two computers used for reference in my statements are "A" and "B".
Specs for "A": 2-core Intel CPU, 8 GB memory, 500 GB disk
Specs for "B": 2-core Intel CPU, 8 GB memory, 500 GB disk
Now I would like to look at the roles of A and B from the Cluster, Grid, and Cloud angles.
Common features of Cluster and Grid
1) A Cluster or a Grid is 2 or more computers hooked up together; on the hardware level
they are hooked up through network cards, and on the software level
some kind of program implementing a message passing interface (MPI)
makes it possible to send commands between nodes.
2) A Cluster or a Grid does NOT combine CPU power or memory between nodes, meaning
that in this scenario a Firefox browser running on A still has only one 2-core CPU,
8 GB memory, and 500 GB available.
Differences between Cluster and Grid:
1) A Cluster only provides the failover part: if node A breaks while Firefox is running,
the cluster software will restart the Firefox process on node B.
2) A Grid, however, is able to run software in parallel on multiple nodes at the same time,
provided that software is coded with MPI in mind. It can also launch any software on any node
on demand (even if it is not written for MPI).
3) A Grid is also able to combine different types of
nodes, such as a Linux server, Windows XP, an Xbox, and a PlayStation, into one Grid.
Cloud definition:
1) Cloud is not a technical term at all; it is just a short, convenient word to describe
a computer of unlimited resources. It could also be called a Supercomputer, a Beast, an Ocean, or a Universe, but someone
said "Cloud" first and here we are.
2) Cloud can be based on Grids or on Clusters
3) From a technical point of view, a Cloud is software that combines hardware resources into one,
meaning that if I install Cloud software on a Grid or Cluster then it will combine A and B
and I will get one Cloud like this: 4-core CPU, 16 GB memory, and 1000 GB disk.
edited: 2013.04.02
Item 3) was complete nonsense: a cloud will NOT combine resources from many nodes into one huge resource, so in this case there will be no 4-core CPU, 16 GB memory, 1000 GB cloud.
Grid computing is designed to parcel out large workloads to many participating grid members--through software on each member which is expecting to hear that request for computation or for data, and to reply with its small piece of the overall puzzle. Applications must be written specifically for this approach to problem-solving. It can be heterogeneous because it's not the OS that matters but the software waiting to hear problem-solving requests.
The expectation of a cluster is that it can run the same executable image across any member node--any node can execute that code--which is what drives its requirement for homogeneity. You can write cluster-aware code which distributes workload throughout the cluster, but again you have to write your code to be cluster-aware in order to take advantage of more than the redundancy features of a cluster. As most application vendors do not write cluster-aware code, the simple redundancy feature is all that's commonly used in cluster deployments, but that does not limit the architecture. Clusters can and do share their resources, and can collaborate on tasks simultaneously.
Cloud, as it's commonly defined, is neither of these, precisely, but it doesn't preclude them, either. Cloud computing assumes the ability to deploy an application without advance knowledge of its underlying operating system, or even control of that operating system, coupled with the ability to expand or reduce the processing and memory footprint available to that application without having to destroy and recreate that environment--all done with enough isolation that the application won't know, or be able to know, what other applications might be installed or running on its shared infrastructure, unless that access is approved of by both application managers.
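To make the earlier point about "cluster-aware code which distributes workload" (the questioner's "coded with MPI in mind") concrete, here is a minimal sketch using mpi4py. The mpi4py binding is just my choice for illustration, and MPI is only one of several ways to write such code:

    # Run with, e.g.:  mpiexec -n 2 python sum_parallel.py
    # Each MPI process (typically one per node or core) works on its own slice.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's id: 0, 1, ...
    size = comm.Get_size()   # total number of processes

    data = range(1_000_000)
    # Take every size-th element starting at our rank, so slices don't overlap.
    partial = sum(data[rank::size])

    # Combine the partial sums on process 0.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print("total:", total)

Code that is not written this way simply runs unchanged on whichever single node it happens to be launched on, which is why plain applications only benefit from a cluster's redundancy features.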
I would like to answer my own question before this is closed as a duplicate, because I believe it can be very frustrating to find correct info with regard to clusters, grids, and clouds, and I think this post can save time for many. If someone wants to challenge it please do so; otherwise I will mark it as the answer in 1 week.
1) There are many differences and there are none; it really depends on the technical context, but
generally you can connect several nodes and call it a Grid, or you can call it a Cluster. I would say a Grid is a Cluster with extended capabilities, such as the ability to connect heterogeneous nodes. Both a Grid and a Cluster will serve equally well as a scale-out platform. From a network engineer's and programmer's perspective, the difference in implementation or coding will be pretty big if the Grid connects heterogeneous nodes.
2) Now, the first question was actually a prelude to the second one, and I believe it is best answered by
Matt Joyce in this post:
https://stackoverflow.com/a/15286488/2230126
I'll take a crack at it. I have been collecting and saving my notes, scripts, and programs since the year 2002 A.D. This is a chop-and-paste of my statements over the years. Here is a brain-friendly memorization list:
The grid is the hardware and the hardware specifications.
a. You plug into the router or switch and set up IP addresses and top-level domains over the internet (top-level domains being governed by ICANN).
b. This is like OSI layers 1, 2, and 3.
The cluster is the kernel (software ring 0, or ring 1 if there is some virtualization going on).
a. The kernel is configured (compiled) to run a network stack that can handle sessions, permissions, and account authentication.
b. You set up port-to-port communication, usually over TCP/IP (as in the OSI model).
c. You set up iptables, pf, arp, and other OS-level applications or shared objects.
d. You can set up ssh, Kerberos, LDAP, or some other PKI-database and protocol-socket combo.
e. This is like OSI layers 4, 5, and 6.
The cloud is user-space applications.
a. The application processes talk to other application-processes within the cluster.
b. You set up process-level permissions (via files, cgroups, and/or user groups).
c. You set up mysql, redis, riak, message brokers, hadoop, apache, nginx, cron, java, haskell, erlang, et cetera.
d. This is like OSI layer 7.
The cloud floats over the cluster that grows from the grid. You can actually picture it: the cloud in the air, the cluster in the tree, and the grid on the ground. Most of us creative types (who make all these technologies) are visual thinkers who can back it up with mathematical data and code. So always see if you can answer the riddle and correlate technological facsimiles to our physical realm here on Earth.
Intro
Grid, Cluster, and Cloud are three different words that mark their specific time in history. Their definitions have intersecting traits, and nowadays they are largely interchangeable. You just need to know when to apply the correct or associated word. For example, I was talking to some older M.D.s (medical doctors) and they wanted to know what the cloud was. So I told them that the cloud was a computer cluster that you rent over the internet. And bingo, they got the idea within 10 seconds.
I will use a little bit of history in chronological prose.
Grid
The term grid was first used to represent one resource that is repeated across the terrestrial landscape or across space. The term was frequently used during the spread of the telegraph, where repeaters had to be placed on poles every N radii (plural of radius) to amplify the signal. Another example is the electrical grid that Thomas Edison and Nikola Tesla competitively started spreading around the Earth. Computers got really popular, and they were soon spread across the Grid to replace human telegraph (and telephone) operators.
The Grid is now a bunch of computers that can connect and terminate communication channels. The Grid is an infrastructure of computers that function for one goal, which is to run assembly (or binary) code.
Cluster
Foreseeing the power of computers and having actually witnessed computers win wars (Turing's machine), DARPA (or ARPA, which is part of the U.S. military) stepped in.
DARPA started commissioning universities and colleges to utilize the Grid for multiplexing communication methods (using bauds and protocols). Universities and colleges started making protocols to separate the different tasks that they wanted to carry out over the Grid and to target the computers. That started the modern internet. In-house testing clusters were established in laboratories to simulate the grid. Clusters are great for orchestration: a job can be sub-divided over all or some of the slaves within a cluster. The military utilized the colleges' and universities' findings and applied the SOFTWARE to the Grid. There were some gotchas with clusters:
Must be same (or near same) hardware
Must have same operating system
The rules were strict because all the instruction sets had to be the same across the CPUs. Clusters usually had a master/slave type of relationship. A Cluster usually ran one Unix job at a time. Clusters had job schedulers. Then clusters got more complex, because hardware manufacturers started making parallel chip architectures (on top of the von Neumann architecture).
Clusters became more powerful. Clusters inherited more complexity, and people were doing more creative things. A cluster could now do different jobs, tasks, processes, asynchronous processes, synchronized processes, and many more interesting things. One box (or computer node) could run more jobs. Now the Grid could be used for multiple purposes. The rate of software updates on clusters was faster than on the actual grid. Clusters were deployed locally on campuses. Clusters started superseding the grid, because you could directly produce a public-facing stack that out-performed the (national) grid.
My Experience
I went to college during the late 1990s and 2000s, and cluster was the word for a physical laboratory of multiple computers working as one virtual computer. Clusters were used for testing. Once your software worked on the cluster, you could mv (move) it to the production-grade Grid. Then I witnessed network worms and computer viruses controlling zombie computers. These swarms of zombies could be used as one gigantic virtual cluster to run commands. Then programmers started building DIY (do-it-yourself) protocols and software like BitTorrent and Napster.
So, leaping forward into the future, testing-cluster software is starting to be replaced by Solaris zones, FreeBSD jails, Linux containers, QEMU, hypervisors, VMware, VirtualBox, Vagrant, and Docker.
Cloud
Cloud is a marketing term used as an umbrella for the hardware of different grids and the software of those clusters. Cloud is one big ubiquitous word used to advertise, promote, and profess all that cluster technology for monetary gain. Cloud is also an effort to wrap all those technologies under one singular word. The Cloud allows multi-tenanted processes to share a gigantic grid. The Cloud maximizes efficiency by sub-dividing the electricity, CPU, RAM, disk, and broadband, which get shared and paid for by consumers. A side effect is that those consumer subscriptions and/or pay rates started producing profit. The Cloud also allows multiple users to install multiple operating systems that run multiple processes, all in software. So now we have acronyms like IaaS, PaaS, and SaaS. The Cloud can replace the start-up cost that was once so darn difficult to fund and bootstrap. The Cloud is a great solution for mock-testing your software and building a consumer base for your business.
From another perspective, the Cloud triggers the brains of non-programmers to think a certain way. For example, the human resources department can comprehend and isolate what is presented in front of them.
So if you have the money, then you can purchase your share of the cloud experience and have easy support along with it. But if you have the skill set, the time, the quick know-how, and the ability to install your own servers at colocation facilities, then do that, because it is cheaper over the long run.
That is my narrative on the Grid vs Cluster vs Cloud.
I think this link compares Cluster and Grid well.
As far as I know, there are some exceptions in the case of Clusters: YARN (Yahoo!) tries to handle multi-tenancy and distributed scheduling, and Corona (Facebook) also has distributed scheduling.
There are two types of problems I want to talk about:
Say you wrote a program you want to encrypt for copyright purposes (e.g. denying unlicensed users access to a certain file, or disabling certain features of the program), but most software-based encryption can be broken by hackers (just look at the number of programs available to crack programs into "full versions").
Say you want to push software to other users, but want to protect against piracy (i.e. another user making a copy of this software and selling it as their own). What effective way is there to guard against this (similar to music protection on CDs, like DRM), both from a software perspective and a hardware perspective?
Or do those two belong to the same class of problems (dongles being the hardware/chip-based solution, as many noted below)?
So, can chip- or hardware-based encryption be used? And if so, what exactly is needed? Do you purchase a special kind of CPU or a special kind of hardware? What do we need to do?
Any guidance is appreciated, thanks!
Unless you're selling this program for thousands of dollars a copy, it's almost certainly not worth the effort.
As others have pointed out, you're basically talking about a dongle, which, in addition to being a major source of hard-to-fix bugs for developers, is also a major source of irritation for users, and there's a long history of these supposedly "uncrackable" dongles being cracked. AutoCAD and Cubase are two examples that come to mind.
The bottom line is that a determined enough cracker can still crack dongle protection; and if your software isn't an attractive enough target for the crackers to do this, then it's probably not worth the expense in the first place.
Just my two cents.
Hardware dongles, as other people have suggested, are a common approach for this. This still doesn't solve your problem, though, as a clever programmer can modify your code to skip the dongle check - they just have to find the place in your code where you branch based on whether the check passed or not, and modify that test to always pass.
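To illustrate what that single branch looks like, here is a minimal sketch in the spirit of the argument above; dongle_present is a hypothetical placeholder, not any vendor's real API:

    import sys

    def dongle_present() -> bool:
        # Hypothetical placeholder: a real check would query the vendor's
        # dongle SDK over USB and verify a challenge/response.
        return False

    def main() -> None:
        if not dongle_present():      # <- the one branch an attacker patches
            print("License check failed.")
            sys.exit(1)
        print("Running the protected feature...")

    if __name__ == "__main__":
        main()

In the compiled binary this boils down to a single conditional jump; flipping or NOP-ing that jump defeats the scheme no matter how sophisticated the dongle itself is.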
You can make things more difficult by obfuscating your code, but you're still back in the realm of software, and that same clever programmer can figure out the obfuscation and still achieve his desired goal.
Taking it a step further, you could encrypt parts of your code with a key that's stored in the dongle, and require the bootstrap code to fetch it from the dongle. Now your attacker's job is a little more complicated - they have to intercept the key and modify your code to think it got it from the dongle, when really it's hard-coded. Or you can make the dongle itself do the decryption, passing in the code and getting back the decrypted code - so now your attacker has to emulate that, too, or just take the decrypted code and store it somewhere permanently.
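A minimal sketch of that decrypt-with-a-dongle-key idea, purely for illustration: get_key_from_dongle is a hypothetical stand-in for a vendor SDK call, the XOR "cipher" stands in for a real algorithm such as AES, and the "code" is just a Python string rather than real machine code:

    def get_key_from_dongle() -> bytes:
        # Hypothetical: in reality this would be a challenge/response
        # exchange with the hardware dongle over USB.
        return b"\x5a\x13\x7e\x42"

    def xor_cipher(blob: bytes, key: bytes) -> bytes:
        # Toy cipher for illustration only; XOR-ing twice with the same
        # key returns the original bytes.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(blob))

    # The "protected" part of the program, shipped only in encrypted form.
    encrypted_code = xor_cipher(b"print('secret feature')", get_key_from_dongle())

    # At run time the bootstrap asks the dongle for the key and decrypts.
    decrypted_code = xor_cipher(encrypted_code, get_key_from_dongle())
    exec(decrypted_code.decode())   # an attacker only needs to dump this string

The attacker's options described above map directly onto this sketch: intercept the key, emulate get_key_from_dongle, or simply dump decrypted_code once and ship it.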
As you can see, just like software protection methods, you can make this arbitrarily complicated, putting more burden on the attacker, but history shows that the tables are tilted in favor of the attacker. While cracking your scheme may be difficult, it only has to be done once, after which the attacker can distribute modified copies to everyone. Users of pirated copies can now easily use your software, while your legitimate customers are saddled with an onerous copy protection mechanism. Providing a better experience for pirates than legitimate customers is a very good way to turn your legitimate customers into pirates, if that's what you're aiming for.
The only - largely hypothetical - way around this is called Trusted Computing, and relies on adding hardware to a user's computer that restricts what they can do with it to approved actions. You can see details of hardware support for it here.
I would strongly counsel you against this route for the reasons I detailed above: You end up providing a worse experience for your legitimate customers than for those using a pirated copy, which actively encourages people not to buy your software. Piracy is a fact of life, and there are users who simply will not buy your software even if you could provide watertight protection, but will happily use an illegitimate copy. The best thing you can do is offer the best experience and customer service to your legitimate customers, making the legitimate copy a more attractive proposition than the pirated one.
They are called dongles; they fit in the USB port (nowadays) and contain their own little computer and some encrypted memory.
You can use them to check that the program is valid by testing whether the hardware dongle is present, you can store encryption keys and other info in the dongle, or sometimes you can have some program functions run in the dongle. It's all based on the dongle being harder to copy and reverse engineer than your software.
See DESkey or HASP (they seem to have been taken over).
Back in the day I saw hardware dongles on the parallel port. Today you use USB dongles like this. Wikipedia link.
I am taking Computer Networking this semester, and I find it quite interesting to learn why the Internet is designed the way it is today. I also enjoy reading the papers referred to in the teaching slides, like "End-to-End Arguments in System Design" and "The Design Philosophy of the DARPA Internet Protocols".
Could you recommend some other interesting papers or articles to me, especially ones related to the higher layers, like the TCP/IP protocols?
Many thanks for your reply.
Roy Thomas Fielding wrote his dissertation at UCI about Architectural Styles and
the Design of Network-based Software Architectures (2000), which focuses more on applications in a networked environment; but I would recommend it, since it's related to your interest in learning why the Internet is designed the way it is today.
Link: Architectural Styles and the Design of Network-based Software Architectures
Generally, the books by Andrew Tanenbaum are worth reading. They are normally used as standard texts in networking courses.
On the other hand, they are rather basic.
One classic book is "Interconnections 2nd edition" by Radia Perlman.
Another very good one is "Routing in the Internet" by Christian Huitema.
If you are serious about learning how the Internet works, you also need to know about BGP. A decent introduction is "BGP4 Inter-Domain Routing in the Internet" by John Stewart.
A more advanced topic is MPLS. A very good book on this is "MPLS-Enabled Applications" by Ina Minei and Julian Lucek.
Another more advanced topic is multicast. I could recommend "Interdomain Multicast Routing" by Brian Edwards and others.
The books published by Cisco press tend to be good (although obviously vendor-specific) if you are interested in more practical details of how to configure network equipment.
Finally, "Unix Network Programming, Volume 1" by Richard Stevens is a must read if you want to do network programming.
http://www.cs.ucsb.edu/~almeroth/classes/F05.276/papers/vegas.pdf
I'm not sure if this is exactly what you were looking for, but there are some really interesting papers out there on peer-to-peer networking.
Some examples:
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Incentives Build Robustness in BitTorrent
Protecting Free Expression Online with Freenet
If you want an introduction to what (not why) the TCP/IP protocols are, a classic book is TCP/IP Illustrated, Volume 1: The Protocols.