Find seed from random numbers

BlitzMax Forums/BlitzMax Programming/Find seed from random numbers

ImaginaryHuman(Posted 2006) [#1]
Does anyone know how I might go about finding a `seed` from a given sequence of random numbers? It sounds either very computationally expensive or highly mathematical?

The sequence of randon numbers can be whatever length, so long as it's at least 4 or so.


Perturbatio(Posted 2006) [#2]
Why would you do this over RndSeed?

(It's a serious question, I have no idea what you would need to do it the convoluted way for).


H&K(Posted 2006) [#3]
You find out which random number code BMax is using, find the equation it is using, then try to "Crack" it.

I think tho that its one of those equations that "Dont go backwards", which I cannot remember the technical term, which basicaly means its unsolveable in reverse

If you really need it. Produce enough of every set of numbers from every possible seed. Then when you want to know which seed gave a specific set. Look it up

(I know that sounds like a lot of work). But my understanding was that these routines are designed so that you cannot do, the specific thing you are trying to do


ImaginaryHuman(Posted 2006) [#4]
Yah. I read about how some pseudo randon number generators are designed to be non-reversable, because in encryption circles where they need a randon number for a key or whatever, it could be decrypted by tracing back the randon number sequence to the original seed.

Also it wouldn't be sufficient to look at BlitzMax's randon number generator or to reverse it because the randon number sequence in question is not generated by BlitzMax's randon number system in the first place.

The randon number sequences I'm trying to reverse could be totally random from any non-blitz source. So I figure you'd have to analyse the sequence and try to come up with a `formulae` reproduces the sequence, and this formulae, with a given seed value, would then describe the sequence in question and could regenerate it.

This is just an idea for data compression. If you can take random data, pictures, files, whatever, and extract a seed and a formula from a small piece of it, that could regenerate it exactly, it would be a really good form of compression. You'd have to probably work on only a short sequence at a time, though.

I think it's too much variability to figure out the possible formulas and reverse them, too much math, and too much time.


H&K(Posted 2006) [#5]
What you have decribed is called Jpeg (Think about it)


ImaginaryHuman(Posted 2006) [#6]
No because JPEG is lossy, this would be lossless, and could do more than one compression pass. It's probably actually closer to fractal compression.


H&K(Posted 2006) [#7]
Thats exactly my point. The lossyness would be due to the series'es(?) that dont exist within the random number generator (If you see what I mean)

Basicaly, if you only have say 1,000,000,000,000,000 posible seeds, then you also only have 1,000,000,000,000,000 possible series

Its a good idea, if you dont Mind a lot of loss. Unless you picked a small "Set" of numbers, then I suppose you could do it. But I dont think that even then you have any garentee that all the series'es available in your closed set would happen


FlameDuck(Posted 2006) [#8]
If you can take random data, pictures, files, whatever
Pictures (well meaningful pictures anyway) do not normally contain random data.

It works with fractals tho'.


ImaginaryHuman(Posted 2006) [#9]
I know pictures aren't greatly random, but there is some degree of randomness. Obviously if a randon number generator would generate the sequence of pixels in an image, it's have to also have the feature that it creates sequences of the same color next to each other, etc.

Also there IS no such thing as 100% random data. Not even supposedly `truly random` data is completely random. There is ALWAYS some kind of degree of a pattern, no matter how small. For example look at random data images generated by some of the sites online, they look random, but there are some pixels that seem to form small lines, some areas darker than others, etc.

Even so that's not the point. The point is trying to find a formulae and related data that can reproduce whatever the data is. And you might note, that fractal compression is also not lossless.


SoggyP(Posted 2006) [#10]
Hello.

I had the same idea years ago - basically as a method for storing C64 memory dumps in as small a space as possible - but never followed through with it. I think it's definitely possible but it's a question of what size chunks of data you want to store.

I think what you would need is a data structure that essentially says:
int seed_value;
int offset_value;
int count;

I don't think I need to explain the above. The difficulty, of course, is knowing what seed value is going to give you the best results and to do that you'd need to go through each random seed value, generate n values and then when doing your compression compare the sequence you've got and look for the longest set of values produced by one of the seeds.

In order for it to be more efficient all you need is for the sequence of values to be longer than the size of the three ints above, what's that 12 bytes? Not too bad. Any sequence of 12 bytes that you can't match just put in as it is.

Then all you need to do is go the list of values and generate a data file.

The main problem I'd guess, as discussed a number of times before, is different puters producing different values from the random function.

Goodbye.

Jes


Chris C(Posted 2006) [#11]
I'd say you'd have to pick random seeds and work through them till you have a pattern match, if ever!....

Fractal compression works by storing coordinates, which show where on fractal a piece of the picture is, in addition scale and rotation are also used.

I would doubt you could even come close to fractal compression, I would be suprised if the idea of storing random seed position and length hasnt been thought of by quite a number of us (I probably thought of this around the same time as SoggyP btw!)

I do wonder if there are any reversable hash algorithms tho!!


ImaginaryHuman(Posted 2006) [#12]
The idea here is to find a seed and possibly a formula that CAN always apply to whatever the data is, so that you don't run into having to include literal data at all. The only place where compression stops is where the input byte total is the same as the amount of bytes to store a single code.

I did try using Max's Rand() function with incrementing seeds, but in an hour-long run period, it only at best found a match for a TWO byte sequence. I tried varying the start point of the seed, it didn't find it at all. Obviously lots depends on the formula.

I think the random compressor thing is possible but I dont see a very fast way of finding the matches. I do believe it would be more accurate than fractal compression - fractal compression is lossy, you know.


H&K(Posted 2006) [#13]
The bigest seed you could use would be a Long. Yes?

So 2^64 different seed which means only 2^64 different series.

Let say we are storing Just Ints between 0 and 255

Pick any series of one number and there are only 2^56 series left. (ie 2^56 series that start with the number we picked)
Pick a second number in our series, 2^48
Pick a third Number, 2^40
Pick a forth Number 2^32
Pick a fifth number 2^24
Pick a sixth number 2^16
Pick a seveth number 2^8
And Finaly an eigth number 0

So if you went to all that effort, you could store a series of Eight ints using your method. ie 64 bits. ie a Long (And Im sure that the first ten number of each of the series are not all unique)

This is not to say its a bad idea, cos Im sure we have all had it


ImaginaryHuman(Posted 2006) [#14]
Um, if you say so


Yeshu777(Posted 2006) [#15]
A LONG is normally the largest bumber you would use, so I can sort of see what H&K is saying.

I just want to ask - why are you doing this - out of idle curiosity.


ImaginaryHuman(Posted 2006) [#16]
Well actually I'm not doing it. I was thinking of doing it but while I do believe it is a possibility I don't think it is achievable in a reasonable amount of time.

I was originally looking into forms of compression and using random number seeds to represent images came to mind. I figured that the only thing capable of compressing `fairly random data` is something that is potentially even more random, like a randon number generator.

It was just an idea along the lines of thinking about creating a custom graphics file format. I'm still going to create the file format but it probably wont use this method.


kenshin(Posted 2006) [#17]
Been thinking about what's been mentioned here for a while. Some time ago I thought about a system similar to this. I don't see any reason to use longs for your dictionary though. You're more likely to find compressable sequences using bytes (or even bits) and the dictionary size would be smaller. You could start with storing 1 bit combinations, then 2 bit, etc..


ImaginaryHuman(Posted 2006) [#18]
Actually it doesnt really have anything to do with creating a dictionary. It's not like huffman compression. Unless you're thinking that the seeds are a dictionary. The sequences of numbers are not stored anywhere, they are meant to be re-generated at decompression time based on the seed.


H&K(Posted 2006) [#19]
@Kenshin,

Do I realy have to give you the example for all combinations of seeds,(long, Int etc) and data (Int,bit)?
What ever seed you use its the same

When this method IS usfull is when used to "Seed" maps. Zarch on the archimedies did a combination of this and sin functions to store levels. And I remember in something, where you type in a number to recreate the same game.
(Ive just rememberd, its one of the card games that come with windows)


kenshin(Posted 2006) [#20]
OK, I thought you were talking about using seeds to generate a dictionary used for compression.
Do I realy have to give you the example for all combinations of seeds,(long, Int etc) and data (Int,bit)?
I don't recall asking you to give an example of anything.
What ever seed you use its the same
Different seeds produce different sequences, so how can it be the same?
And I remember in something, where you type in a number to recreate the same game.
When you start a new game of Archipelagos on Amiga the mouse coordinates are used to generate a seed which in turn is used to generate the layout for random levels. So if you had the mouse in the same location you got the same set of 'random' levels.


H&K(Posted 2006) [#21]
@kenshin

We were talking about seeds to generate a dictionary used for compression

I posted an example, that used arbitary sizes, in what was a general example that holds for all the differnt "Variable" sizes, to show that the dictionary size would be the same as the data size

You then posted
I don't see any reason to use longs for your dictionary though. You're more likely to find compressable sequences using bytes (or even bits)

And so I aksed
Do I realy have to give you the example for all combinations of seeds,(long, Int etc) and data (Int,bit)?

Hopeing that you could see that it applied whatever data/seed size you had chosen

Different seeds produce different sequences, so how can it be the same?
But you didnt. This is probably because I was "loose" in the wording

I don't recall asking you to give an example of anything
No quite right you didnt, but as you had totaly failed to understand the point of the first example, I asummed this was because I hadnt explained it properly, so I just thought Id ask if you needed it explaining to you again.
Which you do. (Only because I dont think I got accross what I was trying to say)

I'll explain.

Let say you are going to store your seed(dictionary word) as an INT. This would give you 2^32 Different seeds. Which in turn will give you 2^32 different series. Lets say the data you want is an INT.

So you look at your seeds to find the series you want, and you take one of the ones that start with the first INT in the series. (what you are doing is divideing the total number of series, by the number of different combinations so far)
There are 2^32 different INT, 2^32/ 2^32 so ONLY one series starts with the number you want.

What ever seed (or data) size you use its the same

(If you store a second int as a pointer to the start of where your data starts in a series, that might be smaller <But before I started to work on such a system I would do the math> {But Im not going to, because I think it would still hold true})

I remember in something, they hadnt used a seed at all, and when you did random game, it was always the same.


kenshin(Posted 2006) [#22]
Yeah, I see what you're getting at now H&K. You're correct too.

The way I see it is that compression works on finding patterns within the dataset to be compressed and placing those patterns in the dictionary. Therefore it's impossible to do lossless compression on anything by using what's essentially a random dictionary. In short, the random dictionary thing doesn't work because the dictionary does not have ANY relation to the dataset to be compressed.