Hi everybody,
I have just posted my new code in the Programming Forum for those interested in a new algorithm.
It is a completely new way of sorting strings.
There is no thrashing around with millions of strings: I simply read the data into numeric arrays using the Asc value of each character and use my algorithm to predict where each byte must go to be in sorted order.
I could use all 256 Asc codes if needed, RAM permitting.
The final array is computed from scratch with my math formula.
It is like being beamed in a Star Trek movie, where you are disintegrated and recreated at the other end.
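To give a rough idea of the counting principle (this is only a sketch, written in C rather than PowerBASIC, and it handles single bytes rather than whole strings; the code actually posted may do things differently), it looks something like this:
Code:
/* Illustrative sketch only, not the posted code: count how many times
   each Asc code occurs, then rebuild the output from the counts alone.
   Sorting whole strings needs the same idea extended per character
   position. */
#include <stddef.h>
#include <string.h>

void sort_bytes_by_count(const unsigned char *in, unsigned char *out, size_t n)
{
    size_t count[256] = {0};

    /* Reading and counting pass: tally every Asc value (0..255). */
    for (size_t i = 0; i < n; i++)
        count[in[i]]++;

    /* Rebuilding pass: the sorted output is computed from the counts,
       with no comparisons and no shuffling of individual strings. */
    size_t pos = 0;
    for (int c = 0; c < 256; c++) {
        memset(out + pos, (unsigned char)c, count[c]);
        pos += count[c];
    }
}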
The difference in speed compared with ARRAY SORT grows as the files get bigger.
The elapsed times are:
Code:
Size         Reading and counting   Actual sorting   Total time   PB Array Sort
1 million          0.5781               0.3281          0.9062         1.0156
2 million          1.1250               0.7500          1.8750         2.3281
4 million          2.2656               0.9375          3.2031         5.4531
8 million          4.5156               1.4062          5.9218        12.9842
16 million         9.9218               1.6875         11.6093        30.1875
But my sort is always faster than PB's.
I have separated the reading and counting part, as it would be done while reading the data from the hard disk, and the counting will not add much to the overall time of that process.
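For those wondering how the counting fits into the reading, here is a rough C sketch of that idea (the file name, buffer size and function name are made up for the example; this is not the posted code):
Code:
/* Illustrative sketch only: the tallies are updated while the file is
   being read, so the counting rides along with the disk I/O and adds
   very little time of its own.  count[] must be zeroed by the caller. */
#include <stdio.h>

int read_and_count(const char *path, size_t count[256])
{
    unsigned char buf[65536];
    size_t got;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    while ((got = fread(buf, 1, sizeof buf, f)) > 0)
        for (size_t i = 0; i < got; i++)
            count[buf[i]]++;   /* counting piggybacks on the read */

    fclose(f);
    return 0;
}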
By the way, in the test ARRAY SORT gets to skip that part, but in the real world it would have to read the original data from disk too, so that would add some more time to its benchmark.
Feel free to test my code; any comments for improving the process are welcome.