How can I implement an IRC Server with 'owned' nicknames? - networking

Recently, I've been reading up on the IRC protocol (RFCs 1459, 2810-2813), and I was thinking of implementing my own server.
I'm not necessarily looking into adhering religiously to the IRC protocol (I'm doing this for fun, after all), but one of the things I do like about it is that a network can consist of multiple servers transparently.
There are a number of things I don't like about the protocol or the IRC specification. The first is that nicknames aren't owned. While services like NickServ exist, they're not part of the official protocol. On the other hand, implementing something like NickServ properly kind of defeats the purpose of distribution (i.e. there'd be one place where NickServ is running, and one data store for it).
I was hoping there'd be a way to manage nicknames on a per-server basis. The problem with this is that if you have two servers that have some registered nicknames, and they then link up, you can have collisions.
Is there a way to avoid this, without using one central data store? That is: is it possible to keep the servers loosely connected (such that they each exist as an independent entity, but can also connect to one another) and maintain uniqueness amongst nicknames?
I realize this question is vague, but I can't think of a better way of wording it. I'm looking more for suggestions than I am for actual yes/no answers. So if anyone has any ideas as to how to accomplish nickname uniqueness in a network while still maintaining server independence, I'd be interested in hearing it. Note that adhering strictly to the IRC protocol isn't at all necessary; I've got no problem changing things to suit my purposes. :)

There's a simple solution if you don't care about strictly implementing an IRC server, but rather implementing a distributed message system that's like IRC, but not exactly IRC.
The simple solution is to use nicknames in the form "nick#host", much like email. So instead of merely being "mipadi", my nickname could be "mipadi#free-memorys-server.net". So I register with just your server, but when your server links up with others to form another a big ole' chat network, you can easily union all the usernames together. There might be a "mipadi" on otherserver.net, but then our nicknames become "mipadi#free-memorys-server.net" and "mipadi#otherserver.net", and everything is cool.
Of course, this deviates a good deal from IRC. :)

They have to be aware of each other. If not, you cannot prevent the sharing of nicknames. If they are, you simply need to transfer updates on the back-end. To prevent simultaneous registrations, you need a transaction system that blocks, requests permission from all other servers, and responds.
To prevent simultaneous registrations during outages, you have no choice but to timestamp the registration, and remove all but the last (or a random for truly simultaneous) registered copy of the nick.
It's not very pretty considering these servers aren't initially merged in the first place.

You could still implement nick ownership without a central instance, if your server instances trust each other.
When a user registers a nick, it is registered with the current server he's connected with
When a server receives a registration that it didn't know of, it forwards that information to all other servers that don't know it yet (might need a smart algorithm to avoid spamming the network)
When a server re-connects to another server then it tries to synchronize the list of registered nicks and which server handles which nick
If there is a collision during that sync, then the older registration is used, and the newer one marked as invalid
If you can't trust your servers, then it'll get a lot harder, as a servers could easily claim every username and even claim the oldest registration for each one.

Since you are trying to come up with something new, the idea that springs to mind, is simply including something unique about the server as part of the nick name when communicating outside of the server. So if you want to message a user on a different server you might have something like user#server
If you don't need them to be completely separate you might want to consider creating some kind of multiple-master replicated database of accounts. Where each server stores a complete copy of the account database, and each server can create new accounts which will be replicated to other servers as possible. You'll probably still have to deal with collisions on occasion though.

While services like NickServ exist, they're not part of the official protocol.
Services are not part of the official protocol because they've nothing to do with the protocol. They're bots with permissions. There's no reason why you couldn't have one running on each server but it does make them harder to maintain.
If you were to go down that path, I would probably suggest the commonly used "multiple master" database replication technique. If one receives a write (in your case, a new user is created or updated, etc) it sends the data to all the other nodes. You'll have to be careful though. If one node is offline when the others get an update, it will need to know to resync on reconnection.
Another technique would be as above but in reverse. Data is only exchanged between nodes when it's needed. Eg if a user tries to log in on a node where there's no data for it, it'll query the others and issue a move order to get all the data to that one node. This is potentially less painful than the replication version but there could be severe problems in netsplits if somebody signs up on a node disconnected from the pack for a duplicate nick.
One technique to nullify the problems of netsplits would be to make chat nodes and their bots netsplit-aware. When they're split, they probably shouldn't allow any write actions... But this could impact on your network if you're splitting lots.
You've also got to ask how secure this might or might not be. IRC network nodes are distributed for performance but they're not "secure". Because of this, service bots are usually run centrally to keep ultimate control over their running. If you distributed the bots and remote node got hacked, they'd potentially have access to the whole user database (depending on the model).

Related

How to handle network calls in Microservices architecture

We are using Micro services architecture where top services are used for exposing REST API's to end user and backend services does the work of querying database.
When we get 1 user request we make ~30k requests to backend service. We are using RxJava for top service so all 30K requests gets executed in parallel.
We are using haproxy to distribute the load between backend services.
However when we get 3-5 user requests we are getting network connection Exceptions, No Route to Host Exception, Socket connection Exception.
What are the best practices for this kind of use case?
Well you ended up with the classical microservice mayhem. It's completely irrelevant what technologies you employ - the problem lays within the way you applied the concept of microservices!
It is natural in this architecture, that services call each other (preferably that should happen asynchronously!!). Since I know only little about your service APIs I'll have to make some assumptions about what went wrong in your backend:
I assume that a user makes a request to one service. This service will now (obviously synchronously) query another service and receive these 30k records you described. Since you probably have to know more about these records you now have to make another request per record to a third service/endpoint to aggregate all the information your frontend requires!
This shows me that you probably got the whole thing with bounded contexts wrong! So much for the analytical part. Now to the solution:
Your API should return all the information along with the query that enumerates them! Sometimes that could seem like a contradiction to the kind of isolation and authority over data/state that the microservices pattern specifies - but it is not feasible to isolate data/state in one service only because that leads to the problem you currently have - all other services HAVE to query that data every time to be able to return correct data to the frontend! However it is possible to duplicate it as long as the authority over the data/state is clear!
Let me illustrate that with an example: Let's assume you have a classical shop system. Articles are grouped. Now you would probably write two microservices - one that handles articles and one that handles groups! And you would be right to do so! You might have already decided that the group-service will hold the relation to the articles assigned to a group! Now if the frontend wants to show all items in a group - what happens: The group service receives the request and returns 30'000 Article numbers in a beautiful JSON array that the frontend receives. This is where it all goes south: The frontend now has to query the article-service for every article it received from the group-service!!! Aaand your're screwed!
Now there are multiple ways to solve this problem: One is (as previously mentioned) to duplicate article information to the group-service: So every time an article is assigned to a group using the group-service, it has to read all the information for that article form the article-service and store it to be able to return it with the get-me-all-the-articles-in-group-x query. This is fairly simple but keep in mind that you will need to update this information when it changes in the article-service or you'll be serving stale data from the group-service. Event-Sourcing can be a very powerful tool in this use case and I suggest you read up on it! You can also use simple messages sent from one service (in this case the article-service) to a message bus of your preference and make the group-service listen and react to these messages.
Another very simple quick-and-dirty solution to your problem could also be just to provide a new REST endpoint on the articles services that takes an array of article-ids and returns the information to all of them which would be much quicker. This could probably solve your problem very quickly.
A good rule of thumb in a backend with microservices is to aspire for a constant number of these cross-service calls which means your number of calls that go across service boundaries should never be directly related to the amount of data that was requested! We closely monitory what service calls are made because of a given request that comes through our API to keep track of what services calls what other services and where our performance bottlenecks will arise or have been caused. Whenever we detect that a service makes many (there is no fixed threshold but everytime I see >4 I start asking questions!) calls to other services we investigate why and how this could be fixed! There are some great metrics tools out there that can help you with tracing requests across service boundaries!
Let me know if this was helpful or not, and whatever solution you implemented!

Protecting hard-coded data that cannot be available to the user, such as a pass phrase

My program needs to decrypt an encrypted file after it starts up to load data it requires to function. This data cannot be available to the user.
I'm not a cryptography expert, so what is the best way to protect hardcoded passphrases and other tidbits of data from users, debugging software and disassembling software?
I understand that this is probably bad practice but it's essential for me (at least for now).
If there are other ways to protect my data from the above 3, could you let me know what those are?
Short answer: you can't. Once the software is on the user's disk, a sufficiently smart and determined user will be able to extract the secret data from it.
For a longer answer, see "Storing secrets in software" on the security.SE blog.
what is the best way to protect hardcoded passphrases and other
tidbits of data from users, debugging software and disassembling
software?
Request the password from the user and don't hardcode the passphrase. This is the ONLY way to be safe.
If you can't do that and must be hardcoded in the app then all bets are off.
The simplest thing you can do (if you don't have the luxury to do something elaborate which will only delay the inevidable) is to delegate the responsibility to the user of the system.
I mean explicitely state that you software is as secure as the "machine" it runs.
If the attacker has access to start pocking around the file system then your app would be the user's least of concerns
In my experience this type of questions are often motivated by either of four reasons:
Your application is connecting to a restricted remote service, such as a database server.
You do not want your users to mess with configuration settings, which in turn do not really have to be kept confidential as long as they are unmodified.
Copy protection of your own software.
Copy protection of data.
Like Illmari Karonen wrote in his answer, you can't do exactly what you are asking for, and this means in particular that 3 & 4 cannot be solved by cryptography alone.
However, if your reason for asking is either 1 or 2, you have ended up asking the questions you do, because you have made some bad decisions earlier in your design process. For instance, in case of 1, you should not make a restricted service accessible from systems you do not trust completely. The typical safe solution is to introduce a middle tier that is the only client to your restricted resource, and which you can make public.
In case of 2, the best solution is often to use exactly the same logic for checking your configuration files (or registry settings or what ever) when they are loaded at start up, as you use for checking consistency when the user enters them using your preferred configuration user interface. If you spot an inconsistency, just bring up your configuration UI and highlight the problem.

Multiple requests to server question

I have a DB with user accounts information.
I've scheduled a CRON job which updates the DB with every new user data it fetches from their accounts.
I was thinking that this may cause a problem since all requests are coming from the same IP address and the server may block requests from that IP address.
Is this the case?
If so, how do I avoid being banned? should I be using a proxy?
Thanks
You get banned for suspicious (or malicious) activity.
If you are running a normal business application inside a normal company intranet you are unlikely to get banned.
Since you have access to user accounts information, you already have a lot of access to the system. The best thing to do is to ask your systems administrator, since he/she defines what constitutes suspicious/malicious activity. The systems administrator might also want to help you ensure that your database is at least as secure as the original information.
should I be using a proxy?
A proxy might disguise what you are doing - but you are still doing it. So this isn't the most ethical way of solving the problem.
Is the cron job that fetches data from this "database" on the same server? Are you fetching data for a user from a remote server using screen scraping or something?
If this is the case, you may want to set up a few different cron jobs and do it in batches. That way you reduce the amount of load on the remote server and lower the chance of wherever you are getting this data from, blocking your access.
Edit
Okay, so if you have not got permission to do scraping, obviously you are going to want to do it responsibly (no matter the site). Try gather as much data as you can from as little requests as possible, and spread them out over the course of the whole day, or even during times that a likely to be low load. I wouldn't try and use a proxy, that wouldn't really help the remote server, but it would be a pain in the ass to you.
I'm no iPhone programmer, and this might not be possible, but you could try have the individual iPhones grab the data so all the source traffic isn't from the same IP. Just an idea, otherwise just try to be a bit discrete.
Here are some tips from Jeff regarding the scraping of Stack Overflow, but I'd imagine that the rules are similar for any site.
Use GZIP requests. This is important! For example, one scraper used 120 megabytes of bandwidth in only 3,310 hits which is substantial. With basic gzip support (baked into HTTP since the 90s, and universally supported) it would have been 20 megabytes or less.
Identify yourself. Add something useful to the user-agent (ideally, a link to an URL, or something informational) so we can see your bot as something other than "generic unknown anonymous scraper."
Use the right formats. Don't scrape HTML when there is a JSON or RSS feed you could use instead. Heck, why scrape at all when you can download our cc-wiki data dump??
Be considerate. Pulling data more than every 15 minutes is questionable. If you need something more timely than that ... why not ask permission first, and make your case as to why this is a benefit to the SO community and should be allowed? Our email is linked at the bottom of every single page on every SO family site. We don't bite... hard.
Yes, you want an API. We get it. Don't rage against the machine by doing naughty things until we build it. It's in the queue.

Is jumping across servers necessarily a bad programming practice?

I have a system created that a user at one of our other locations and on their server inserts a record. That data is then replicated to a central server. Users working on the central server are allowed to edit that record which means I have to lock the editing capabilities at the location the record was created.
However, i would like the creator of the record to be able to edit it so I am thinking about redirecting them across servers to edit the record on the central server. IS that bad practice and why??
The reason I only allow editing on one of the copies of the record is to prevent it from being copied over in replication.....
WE are also considering Bi-directional replication.
It's not a bad practice. Not sure it makes a lot of sense (to me), but you could do it without causing problems, it seems.
But you should consider why are you using replication at all, then. Whatever reasons you come-up with, see how those fit with connecting to your central server. Since the main point of replication is to allow you to operate without connecting to the central server (except when you sync.)
Also consider that replication has some syncing / update options that might help you achieve your goal in a different way.
In a past job life we used SQL Server transactional replication from a central publisher to three subscriber servers. Latency was such that the users clamored for (and received) the ability to update replicated data on the destination servers which resulted in replication failures/conflicts.
It sounds like the overall design to use replication perhaps had not taken into account a desire (or need) to update the subscriber directly. Is is possible to do a quick white board with key players on this?

Caching data and notifying clients about changes in data in ASP.NET

We are thinking to make some architectural changes in our application, which might affect the technologies we'll be using as a result of those changes.
The change that I'm referring in this post is like this:
We've found out that some parts of our application have common data and common services, so we extracted those into a GlobalServices service, with its own master data db.
Now, this service will probably have its own cache, so that it won't have to retrieve data from the db on each call.
So, when one client makes a call to that service that updates data, other clients might be interested in that change, or not. Now that depends on whether we decide to keep a cache on the clients too.
Meaning that if the clients will have their own local cache, they will have to be notified somehow (and first register for notifications). If not, they will always get the data from the GlobalServices service.
I need your educated advice here guys:
1) Is it a good idea to keep a local cache on the clients to begin with?
2) If we do decide to keep a local cache on the clients, would you use
SqlCacheDependency to notify the clients, or would you use WCF for
notifications (each might have its cons and pros)
Thanks a lot folks,
Avi
I like the sound of your SqlCacheDependency, but I will answer this from a different perspective as I have worked with a team on a similar scenario. We created a master database and used triggers to create XML representations of data that was being changed in the master, and stored it in a TransactionQueue table, with a bit of meta data about what changed, when and who changed it. The client databases would periodically check the queue for items it was interested in, and would process the XML and update it's own tables as necessary.
We also did the same in reverse for the client to update the master. We set up triggers and a TransactionQueue table on the client databases to send data back to the master. This in turn would update all of the other client databases when they next poll.
The nice thing about this is that it is fairly agnostic on client platform, and client data structure, so we were able to use the method on a range of legacy and third party systems. The other great point here is that you can take any of the databases out of the loop (including the master - e.g. connection failure) and the others will still work fine. This worked well for us as our master database was behind our corporate firewall, and the simpler web databases were sitting with our ISP.
There are obviously cons to this approach, like race hazard, so we were careful with the order of transaction processing, error handling, de-duping etc. We also built a management GUI to provide a human interaction layer before important data was changed in the master.
Good luck! Tim

Resources