I have a file that does not seem to contain anything human-readable.
How can I be sure that it really contains nothing readable? It is far too large to read in full, so I am looking for a program that searches for words, measures entropy, or something similar.
How can I tell whether the file is compressed, encrypted, or both? And could it use a proprietary compression format that I can't distinguish from encryption?
If I can confirm that it is encrypted, I can stop working on it right away, but if it is only encoded or compressed, I may be able to find a way to read it.
(I tried compressing it with the built-in Windows archiver and it shrank by 18%. Does that mean it is not encrypted? Can encrypted data be compressed that much?)
Yes, it is certainly possible to create a compression format for which all possible sequences of bits are valid. In that case, you would not be able to distinguish the compressed data from random or encrypted data.
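To put a number on "looks random", here is a minimal Python sketch that estimates byte entropy and the share of printable bytes. The function names are my own, and the interpretation is only a rule of thumb: well-encrypted or well-compressed data usually sits very close to 8 bits per byte, so a clearly lower figure suggests some remaining structure.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def printable_ratio(data: bytes) -> float:
    """Return the fraction of bytes that are printable ASCII, tab, CR or LF."""
    if not data:
        return 0.0
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data)
```

For a large file it is usually more informative to run this over fixed-size windows (say, 64 KiB at a time) than over the whole file, because a low-entropy header can hide in front of a high-entropy body.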
I am not aware of a commonly implemented compression format that has that property. You could try all of the decompressors you can find on the data to see if any of them decompress all the way through the data without erroring out. You can also try starting at different locations in your data, since there may be some sort of header before the compressed data.
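The "try starting at different locations" idea can be sketched in Python with the standard `zlib` module. This only covers zlib- and gzip-wrapped deflate streams, which is my own choice for the example; other formats would need their own decompressors. The function name and the `max_offset` limit are assumptions.

```python
import zlib


def find_wrapped_streams(data: bytes, max_offset: int = 4096):
    """Yield (offset, decompressed) for each offset where a complete
    zlib- or gzip-wrapped stream decodes without error."""
    for offset in range(min(max_offset, len(data))):
        for wbits in (15, 31):  # 15 = zlib wrapper, 31 = gzip wrapper
            decomp = zlib.decompressobj(wbits)
            try:
                out = decomp.decompress(data[offset:])
            except zlib.error:
                continue
            # eof is only set once the stream ended and its checksum matched
            if decomp.eof and out:
                yield offset, out
                break
```

Raw deflate (negative `wbits`) is left out on purpose: it has no header or checksum, so random bytes decode "successfully" far too often to be useful as a test.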
Online Decryption
If you would like to decrypt the file, you could simply copy and paste its contents into https://online-toolz.com/tools/text-encryption-decryption.php
That tool can decrypt messages quickly.
Encoder & Decoder
https://www.base64decode.org/
I found this website a while ago; it is trusted, fast, and has great reviews.
This method may also help with your request.
I have a .aes file whose decryption password I thought I knew but which does not yield the decrypted version of the file.
I am 99.9% certain that the password I have (and which, in fact, I had written down and safely stored) is correct. The problem is that the .aes file was generated by a well-known open-source Bitcoin wallet software known as MultiBit which simply stopped working sometime in 2017, with many other users reporting similar problems.
I am told that MultiBit may have incorrectly passed some non-alphanumeric characters of my password to whatever internal function it was using to generate the encrypted file. In practice, that means I could potentially crack my .aes file if I cycle through the possible characters at the positions marked by question marks in a password string that looks something like this:
i-AM-a-PASS?RD-with-3-UNKN?WN-charact?rs
So I guess my question is: is anyone aware of a regex-based brute-force approach that could be used for cracking .aes encryption passwords? The regex itself may need to employ both ? and * characters.
The amount of Bitcoin in the wallet was absolutely trivial, but with Musk's recent tweets sending Bitcoin to new highs I'm thinking I could buy a spanking new laptop if I can crack this.
Any suggestions most welcome.
Thanks,
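A minimal Python sketch of the candidate generation described above, assuming each `?` stands for exactly one unknown character (the variable-length `*` case is not handled). `expand_pattern`, `first_match` and the `try_password` callback are hypothetical names; the callback stands in for whatever routine actually attempts to decrypt the .aes file, which is not shown here.

```python
import itertools
import string


def expand_pattern(pattern: str, alphabet: str, wildcard: str = "?"):
    """Yield every string obtained by replacing each wildcard in pattern
    with one character from alphabet."""
    slots = [alphabet if ch == wildcard else ch for ch in pattern]
    for combo in itertools.product(*slots):
        yield "".join(combo)


def first_match(candidates, try_password):
    """Return the first candidate accepted by try_password, or None."""
    for candidate in candidates:
        if try_password(candidate):
            return candidate
    return None


# Three unknown positions over the 32 ASCII punctuation characters
# give 32 ** 3 = 32768 candidates.
candidates = expand_pattern(
    "i-AM-a-PASS?RD-with-3-UNKN?WN-charact?rs", string.punctuation
)
```

With only a few unknown positions the search space stays small, so the cost is dominated by how slow a single decryption attempt is.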
I'm using libsodium to encrypt files with xchacha20poly1305 construct. I got everything working correctly by following documentation (https://download.libsodium.org/doc/secret-key_cryptography/secretstream.html) but now I'm wondering about the role of header data.
crypto_secretstream_xchacha20poly1305_init_pull requires the header produced when the crypto_secretstream_xchacha20poly1305_state was initialized for encryption, so how should I treat the header data? Is it like the IV/nonce in AES, which can be distributed as-is with the encrypted data, or is it secret like the key?
I realize this is most likely a newbie question, but since I'm obviously not a crypto expert, I want to make sure I use libsodium and the construct correctly.
Thanks!
That's a pretty old question, but since it was still waiting for an answer, here it is.
The header is indeed a nonce. It doesn't have to be secret. But it is required so that if the same stream is encrypted twice, both ciphertexts will look completely different.
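Since the header is not secret, one common approach is to store it in the clear at the start of the output file. The Python sketch below only illustrates such a file layout; it performs no cryptography and treats the header and the encrypted chunks as opaque bytes already produced by libsodium. The length-prefixed framing and the function names are my own assumptions, and the 24-byte size is assumed to match crypto_secretstream_xchacha20poly1305_HEADERBYTES.

```python
import io
import struct

HEADER_BYTES = 24  # assumed: crypto_secretstream_xchacha20poly1305_HEADERBYTES


def write_container(out, header: bytes, encrypted_chunks) -> None:
    """Write the public header, then each chunk with a 4-byte length prefix."""
    if len(header) != HEADER_BYTES:
        raise ValueError("unexpected header length")
    out.write(header)
    for chunk in encrypted_chunks:
        out.write(struct.pack("<I", len(chunk)))
        out.write(chunk)


def read_container(src):
    """Return (header, chunks); the header is what init_pull needs."""
    header = src.read(HEADER_BYTES)
    if len(header) != HEADER_BYTES:
        raise ValueError("truncated header")
    chunks = []
    while True:
        prefix = src.read(4)
        if not prefix:
            break
        if len(prefix) != 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack("<I", prefix)
        chunk = src.read(size)
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        chunks.append(chunk)
    return header, chunks
```

Only the key has to stay secret; anyone who has the file but not the key learns nothing useful from the header.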
Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
Last week, I had to create a little GUI for homework.
None of my schoolmates did it. They stole mine from the place where we had to upload it and then uploaded it again as their own. When I told my teacher it was all my work, he did not believe me.
So I thought of putting a useless method or something inside with a proof that I coded it. I thought of encryption. My best idea up till now:
String key = ("ZGV2ZWxvcGVkIGJ5IFdhckdvZE5U"); //My proof in base64
Can you think of some other better ways?
I had the same problem as you a long time ago. We had Windows 2000 machines and uploaded files to a Novell network folder that everyone could see. I used several tricks to beat even the best thieves: whitespace watermarking; metadata watermarking; unusual characters; trusted timestamping; modus operandi. Here they are in order.
Whitespace watermarking:
This is my original contribution to watermarking. I needed an invisible watermark that worked in text files. The trick I came up with was to put a specific pattern of whitespace between programming statements (or paragraphs). The file looked the same to them: some programming statements and line breaks. Selecting the text carefully would show the whitespace. Each empty line would contain a certain number of spaces (e.g. 17) that is obviously not random or accidental. In practice, this method worked for me because they couldn't figure out what I was embedding in the documents.
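As a rough illustration of the blank-line idea, here is a small Python sketch. The function names are hypothetical and 17 is just the example count from above; it pads every empty line with a fixed number of spaces and counts how many such lines a file contains.

```python
def watermark_blank_lines(source: str, spaces: int = 17) -> str:
    """Replace every empty line with a line holding exactly `spaces` spaces."""
    out = []
    for line in source.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # preserve the original line ending
        out.append((" " * spaces if body == "" else body) + ending)
    return "".join(out)


def count_marked_lines(source: str, spaces: int = 17) -> int:
    """Count the lines that consist of exactly `spaces` spaces."""
    return sum(1 for line in source.splitlines() if line == " " * spaces)
```

A copied file in which several "empty" lines all carry the same odd number of spaces is hard to explain as a coincidence.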
Metadata watermarking
This is where you change the file's metadata to contain information. You can embed your name, a hash, etc. in unseen parts of a file, especially EXEs. In the NT days, Alternate Data Streams were popular.
Unusual characters
I'll throw this one in just for kicks. An old IRC impersonation trick was to make a name with letters that look similar to another person's name. You can use this in watermarking. The Character Map in Windows will give you many unusual characters that look similar to, but aren't, a letter or number you might use in your source code. These showing up in a specific spot in someone else's work can't be accidental.
Trusted Timestamping
In a nutshell, you send a file (or its hash) to a third party who then appends a timestamp to it and signs it with a private key. Anyone wanting proof of when you created a document can go to the trusted third party, often a website, to verify your proof of creation time. These have been used in court cases for intellectual property disputes, so they are a very strong form of evidence. They're the standard way to accomplish the proof you're seeking. (I included the others first because they're easy, they're more fun, and they will probably work.)
This Wikipedia article might help your instructor understand your evidence and the external links section has many providers, including free ones. I'd run test files through free ones for a few days before using them for something important.
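If you send only a hash rather than the file itself, the hash can be computed locally. A minimal Python sketch, assuming SHA-256 is acceptable to the timestamping provider you pick (the function name is my own):

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file as a hex string,
    reading it in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Keep an unmodified copy of the exact file you hashed; changing even one byte afterwards produces a different digest and the timestamp no longer matches.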
Modus operandi
So, you did something and you now have proof, right? No, the students can still say you stole the idea from them or some other nonsense. My fix for this was to establish, in private, one or more of my methods with my instructor. I tell the instructor to look for the whitespace, look for certain symbols, etc. but never to tell the others what the watermark is. If the instructor agrees to keep your simple techniques secret, they will probably continue to work fine. If not, there's always trusted timestamping. ;)
If your classmates stole your code from the upload site, I would encrypt your homework and email the key to the teacher. You can do this with PGP if you want to be complicated, or something as simple as a Zip file with a password.
EDIT:
PGP would allow you to encrypt/sign without revealing your key, but you can't beat the sheer simplicity of a Zip file with a password, so just pick a new key for every homework assignment. Beauty in simplicity :)
If you are giving source code to the teacher, then simply add a serialVersionUID to one of your class files that is an encrypted version of your name. You can decrypt it to the teacher yourself.
It means nothing to the others, only to you. You can say it's generated code; if they're stealing it, they probably won't bother to modify it at all.
If you want to do it in a stylish way, you could use this trick, if you find the random seed that produces your name. :) That would be your number then, and wherever it appears that would prove that it was you who made that code.
This happened with a pair of my students who lived in the same apartment. One stole the source code from a disk left in a desk drawer.
The thief slightly modified the stolen source, so that it wouldn't be obvious. I noticed the similarity of the code anyway, and examined the source in an editor. Some of the lines had extra spaces at the ends. Each student's source had the same number of extra spaces.
You could exploit this to encode information without making it visible. You could encode your initials or your student ID at the ends of some lines, with spaces.
A thief will likely make cosmetic changes to the visible code, but may miss the non-visible characters.
EDIT:
Thinking about this a little more, you could use spaces and tabs as Morse-code dits and dahs, and put your name at the end of multiple lines. A thief could remove, reorder or retype some lines without destroying your identification.
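A minimal Python sketch of hiding an identifier in trailing whitespace. Note that it uses a plain 8-bit binary code (space = 0, tab = 1, one character per line) rather than Morse code, simply because that is shorter to write; the function names are hypothetical.

```python
def embed_tag(lines, tag: str):
    """Append one ASCII character of tag, as 8 trailing spaces/tabs,
    to each of the first len(tag) lines."""
    data = tag.encode("ascii")
    if len(data) > len(lines):
        raise ValueError("not enough lines to hold the tag")
    out = []
    for index, line in enumerate(lines):
        base = line.rstrip(" \t")
        if index < len(data):
            bits = format(data[index], "08b")
            base += "".join("\t" if bit == "1" else " " for bit in bits)
        out.append(base)
    return out


def extract_tag(lines) -> str:
    """Decode every line that ends in exactly 8 spaces/tabs."""
    chars = []
    for line in lines:
        tail = line[len(line.rstrip(" \t")):]
        if len(tail) == 8:
            bits = "".join("1" if ch == "\t" else "0" for ch in tail)
            chars.append(chr(int(bits, 2)))
    return "".join(chars)
```

This simple version does not survive an editor that strips trailing whitespace on save, and it does not repeat the tag, so a real scheme would want some redundancy.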
EDIT 2:
"Whitespace steganography" is the term for concealing messages in whitespace. Googling it reveals this open-source implementation dating back to the '90s, using Huffman encoding instead of Morse code.
It seems like an IT administration problem to me. Each student should have their own upload area which cannot be accessed by other students.
The teacher would be one level up, able to access each student's upload folder. If this is not possible, go with @exabrial's answer, as that is the simplest solution.
The best thing you can do is to just zip the source code with a password and e-mail the password to the teacher.
Problem solved.
Use a distributed (=standalone) version control system, like git. Might be useful too.
A version history with your name, and dates might be sufficiently convincing.
What was stolen?
The source? You can put random strings in it (but they can be changed). You can also add a special behavior known only to you (for example, a special key press that changes the color of a row), and then ask the teacher, "Do the others know this special combo?" The best way would be to make the program crash after 5 minutes of activity if an empty, seemingly useless file is not present in the archive; your schoolmates will be too lazy to wait that long.
The binary? Just comparing the checksums of each .class file will be enough (your schoolmates are too lazy to rewrite the class files).
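The checksum comparison could look something like this Python sketch, which groups files whose contents are byte-for-byte identical. The function name is my own, and it assumes the teacher has the submitted .class files available as local paths.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_files(paths):
    """Return lists of paths whose contents have the same SHA-256 digest."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    # Only groups with more than one member indicate duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Identical checksums only show that the files match, not who wrote them first, so this works best combined with one of the authorship proofs from the other answers.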
Just post your solution at the last minute. This won't give time to anyone to copy it.
And send feedback to the administrator asking to prevent students from seeing other students' assignments.
If you upload the file in a password-protected .zip, anyone can download the .zip file and have their CPU run a million password guesses against it, if they are that determined to cheat. Unfortunately, some are, and it's easy to do.
Your source can be viewed on the shared server by the other students. The teacher should really be giving you your own password-protected directory to upload to. This could be done easily just by adding subdomains. But perhaps the teacher might allow you to upload the files to your own server so he can access them there.
It's also possible to obfuscate the script so that it has a document.write('This page was written by xxxxx'), forcing anyone who copies your work to not be able to remove the credit unless they first decrypt it. But the real answer is that your school needs to give each of its students their own password protected directories.
In my case, my teachers came up with a better approach. The questions they provided had something to do with our registration numbers.
Ex:
The input to a function/theory is our registration number, which is different for each student.
So the answers, or the approach to the solution, differ from student to student. This forces every student to do the homework on their own, or at least to figure out how to adapt someone else's approach to their own registration number [which may be harder than learning the lesson ;)].
Hope your lecturer will read this thread before his next tutorial :D
I'm building a form and I wonder if there is a significant advantage in showing values in a more human readable format; e.g:
index.php?user=ted&location=newyork
Rather than:
index.php?user=23423&location=34645
On the one hand, having the query string a little more readable allows the user and search engines to better understand where they are, but this also creates a little more work on the server side, as I'll have to track down the associated rows through something other than their unique id.
For example, first find what the id of 'newyork' is before being able to work on other rows that require the location_id. I always prefer to give the db as little work as possible.
Edit: decided to go with readability. I figure I can always use the mysql query cache to speed things up if necessary.
Use human readable values when you can. Just be sure to sanitise the input.
Edit: Yes, this can and should still be done for SEO purposes (if it's worth it to you) if you have lots of choices. Even if the user has lots of choices, you should know what they are (or what the limits are) so that you can properly sanitise the input. For instance, if they are choosing states, you can know all 50. If they are just making up their own text, make sure on your end that it's only text.
A good rule of thumb is to store data as IDs and display it as human-readable text.
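A minimal sketch of that rule of thumb, written in Python rather than PHP for brevity: the readable value from the query string is validated against a strict pattern and then mapped to its stored ID. The lookup table and the function name are hypothetical; in a real application the mapping would come from the database.

```python
import re

# Hypothetical lookup table; "newyork" -> 34645 mirrors the question's example.
LOCATION_IDS = {"newyork": 34645, "boston": 34646}

SLUG_RE = re.compile(r"[a-z0-9]{1,32}")


def location_id_from_query(value: str):
    """Return the stored ID for a readable location value, or None
    if the value is malformed or unknown."""
    slug = value.strip().lower()
    if not SLUG_RE.fullmatch(slug):
        return None  # reject anything that is not a plain slug
    return LOCATION_IDS.get(slug)
```

Because only known slugs ever reach the rest of the code, the readable URL costs one extra lookup and nothing user-supplied is passed on unchecked.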
Depends on your goals.
If you are talking about something like a blog where you want everyone to see everything (and find it easily), then the human/search-engine-readable format is a no-brainer.
If these pages are locked behind a login then it doesn't much matter. You can do what is easier on the database.
For most internet apps, I'd err on the side of readability since that will help with search engines as well.
You shouldn't worry about the efficiency for any typically sized application on any reasonable database engine. Write your app for users, not for query optimizers. QO's can more easily take care of themselves. Deal with optimization in the unlikely event you start seeing a problem.
I am going to come from a different direction. In my opinion, a URL should be readable if you want the user to be able to change their parameters by editing the URL instead of using the UI. An example would be https://www.coolreportingapp.com/accountReport.jsp?account=ABC&month=200911 . In this example, the user can "easily" change the account or month they are looking at without messing with the UI. This of course means you need to validate the URL params each and every time, which you should do anyway. If you don't want the user to alter the URL params, you need to obfuscate and hash the values and use the hash to verify that they haven't been changed.
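One way to do the "hash and verify" part is a keyed hash (HMAC) appended to the query string. The Python sketch below is only an illustration: `SECRET_KEY`, the `sig` parameter name and both function names are assumptions, and it detects tampering without hiding the values.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side secret


def sign_query(params: dict) -> str:
    """Return a query string with an HMAC-SHA256 signature appended."""
    query = urlencode(sorted(params.items()))
    sig = hmac.new(SECRET_KEY, query.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{query}&sig={sig}"


def verify_query(query_string: str) -> bool:
    """Return True only if the signature matches the other parameters."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    sigs = [value for key, value in pairs if key == "sig"]
    if len(sigs) != 1:
        return False
    unsigned = urlencode(sorted((k, v) for k, v in pairs if k != "sig"))
    expected = hmac.new(
        SECRET_KEY, unsigned.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(sigs[0].encode("utf-8"), expected.encode("utf-8"))
```

A plain unkeyed hash would not be enough here, since the user could recompute it after editing the values; the secret key is what prevents that.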
Seriously, IMHO, in your example neither is more readable than the other. Do normal users know that "&" separates the "variables" in "user=ted&location=newyork&"? Do they even need to know that something like a variable exists? With this in mind, what's the difference between showing numbers and words?
If you really want readable URLs, you should build SEO-friendly (human-readable) URLs. Remember that even a simple "dashes vs underscores" question matters in the end.
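For what it's worth, turning a title into a dash-separated URL segment can be sketched in a few lines of Python. The function name is hypothetical, and the choice of dashes over underscores here just follows the usual SEO advice rather than anything stated in the question.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Convert text to a lowercase, dash-separated, ASCII-only URL segment."""
    # Split accented characters into base letter + accent, then drop the accents.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of other characters (spaces, underscores, punctuation).
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

With this, a location such as "New York City!" would appear in the path as `new-york-city` instead of a numeric ID.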