This paper analyzes the current implications of cloud computing for steganography (e.g. Hayati P., et al. 2005) that is based on signal and image processing algorithms. Steganography, which means concealed writing, is the art and science of writing a message in such a way that no one except the intended recipient suspects that a hidden message exists. The concealed information can be images, text, or any other type of binary data. This work will focus on uses of steganography which include the following:
Data Mining vs. Inverse Data Mining
To understand the term “inverse data mining”, we first explain the better-known term “data mining”.
Data mining, like mineral mining, is the art of extracting objects that differ from their surroundings in some set of properties. In most cases, in both mineral mining and data mining, the valuable material is a very small percentage of the overall volume.
The distinguishing properties of the valuable material are exploited to separate it from its environment. For example, iron is separated from the surrounding material either by using its magnetic properties or by smelting. Genome (Watson J. D. 2003) investigation (Ophir, 2013) is an example of data mining, in which specific sequences, for example ternary tracts, are separated from the whole genome. This operation is very CPU-intensive and requires the use of supercomputers.
Data mining sometimes poses even greater challenges to the potential miner than mineral mining does. The data miner often does not know what to look for, whereas the mineral miner knows exactly what he is looking for. The data miner generally looks for something that differs from its surroundings. This is usually the starting point for most data mining.
Data mining looks for exceptional data or, conversely, for properties that are common to all or part of the investigated data collection. Such a common denominator then helps in further filtering the exceptions.
Statistical tools are generally used to process the data and look for deviations. However, a hypothesis must first be proposed as to which set of properties should be investigated.
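As an illustration of looking for deviations once a hypothesis has been chosen, the following minimal sketch flags values that lie far from the mean of a collection. The z-score test, the function name `find_deviations`, the threshold, and the sample readings are all assumptions made for this example; the text does not prescribe a particular statistical tool.

```python
import statistics


def find_deviations(values, threshold=2.5):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Hypothesis (an assumption of this sketch): the interesting property is
    the distance of a value from the mean, measured in standard deviations.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values are identical, so nothing deviates
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - mean) / stdev > threshold
    ]


# Hypothetical measurements: one value differs from its surroundings.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 9.7, 10.3]
outliers = find_deviations(readings, threshold=2.5)
print(outliers)
```

The choice of property matters as much as the test itself: a different hypothesis (for example, the difference between neighbouring values) would need a different statistic and could flag different items.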
In artificially generated data (unlike the human genome), the complexity of data mining is not necessarily lower than in naturally generated data. Even modern supercomputers can be too slow to break good encryption. We would like to introduce a new term, “inverse data mining”, for the generation of encrypted data. Inverse data mining generates data, for example messages, in a way that is difficult to decipher. Data mining tries to find common denominators; conversely, inverse data mining tries to hide any common denominators. Steganographic encoding, being a sub-domain of encryption theory, fits the definition of inverse data mining.
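To make the idea of hiding a message inside other data concrete, the sketch below uses least-significant-bit (LSB) embedding, a common textbook technique. It is an assumption of this example, not a method taken from this paper. The names `embed` and `extract` and the synthetic cover bytes (standing in for pixel values) are hypothetical.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Hide message bits in the least significant bit of each cover byte."""
    # Unpack the message into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for index, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[index] = (stego[index] & 0xFE) | bit
    return bytes(stego)


def extract(stego: bytes, length: int) -> bytes:
    """Recover `length` message bytes from the low bits of the stego data."""
    message = bytearray()
    for k in range(length):
        value = 0
        for byte in stego[8 * k : 8 * k + 8]:
            value = (value << 1) | (byte & 1)
        message.append(value)
    return bytes(message)


# Synthetic cover data: 48 bytes, enough for a 6-byte message.
cover = bytes(range(100, 148))
stego = embed(cover, b"hidden")
recovered = extract(stego, 6)
print(recovered)
```

Each cover byte changes by at most one, so the carrier looks almost unchanged. The sketch makes no attempt to hide the statistical regularities that embedding may introduce, which is exactly what inverse data mining, as defined above, would have to address.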