A Novel and Seamless Architecture to Mine Encrypted Mobile Users Data over Cloud Server

pp.45-48

Vijay H. Kalmani1, Dinesh Goyal2, Sanjay Singla3
1 Research Scholar, Suresh Gyan Vihar University, Jaipur – 302025, India
2 Professor, Gyan Vihar School of Engineering and Technology, Suresh Gyan Vihar University, Jaipur – 302025, India
3 Professor, School of Engineering & Technology, IET Bhaddal Technical Campus, Ropar, Punjab, India
*Corresponding Author, email: vijaykalmani@gmail.com

Abstract: Cloud computing has made the storage and retrieval of data easier than ever before and continues to open the way for new research. Mobile users commonly rely on cloud services to store their files because of the limited memory capacity of their devices. Because the private files stored on a server are confidential, servers generally encrypt the files they store. A problem arises when a user issues a keyword search query over these encrypted files and the server has to return links to the files relevant to that query. Since the files are encrypted on the server, it is not feasible to decrypt every user file to find the rank of the keyword in each file. In addition, some highly relevant files may contain synonyms rather than the exact keywords. We propose a mechanism that not only allows users to store their private files securely on the server, encrypted using an attribute of their device, but also allows them to search those files effectively and efficiently. Experimental results show the efficiency of the proposed system under different metrics.
Keywords: Mobile Computing, Mobile Search Engine, Synonym Queries

INTRODUCTION
Cloud computing has set a high benchmark in the field of web services and has opened the door to new research. It offers immense possibilities: users can not only draw on a remote server’s power for data storage and high-speed processing but also enjoy facilities such as round-the-clock accessibility with seamless transmission [1,7]. It has gained popularity not only among web users; large numbers of mobile users have also adopted the technology in their various areas of work. Security concerns are a constant matter of discussion in this area because of the wide public access [24]. Most mobile users store their private documents on cloud servers and look for relevant information and documents by searching with keywords through an app’s search engine. Search engines have made life easier by giving people access to relevant information that is stored among a huge pile of data on the server [5,6]. Users can connect to a search engine within their app, make a query, and get results instantly. Many apps connect to cloud servers and allow mobile devices to obtain these services. Cloud servers allow these apps to use their infrastructure and perform efficiently. Confidentiality of users’ data nevertheless remains an issue because the data is held remotely. Because a cloud server poses security threats to the files, the files are encrypted before being stored on it. Users should be allowed to access only the documents meant for them, through a fine-grained access mechanism [15,18]. A keyword search on these files needs the Term Frequency (TF) to be pre-computed and stored in the database, associated with the corresponding file. Moreover, the TF of a keyword alone is not enough to rank the relevance of a file; the TF of the keyword’s synonyms must also be checked to obtain a proper ranking. The proposed mechanism helps the application find and display a proper listing of files according to their rank.

Figure 1: Encrypted Data Search over Cloud

RELATED WORKS
A. Safe Multiple-keyword Ranked Search over Encrypted Cloud Data
To protect stored data properly, the information must be encrypted before it is stored. It is equally essential to support search over the encrypted information. Cloud data storage must allow multiple keywords in a single query and return the data records in order of relevance. The key aim in [8] is to solve the problem of multi-keyword ranked search over encrypted cloud data (MRSE) while preserving strict system-wise privacy in the cloud computing paradigm. Among the various multi-keyword semantics available, the efficient similarity measure of “coordinate matching” is employed to capture the relevance of data records to the search query. In particular, “inner product similarity”, that is, the number of query keywords appearing in a file, is employed in the MRSE algorithm to quantitatively evaluate the similarity of that file to the search query [9].
The most important limitation of that work is that the user’s identity (ID) is not kept concealed. As a result, whoever stores information with the Cloud Service Provider can be identified. This may be unsafe in circumstances where the privacy of information must be protected. The proposed method overcomes this disadvantage.

B. Privacy Preserving Keyword Searches on Remote Encrypted Data
The issue arises when a user U wishes to store his documents in encrypted form on a remote file server S. Later, U would like to efficiently retrieve some of the encrypted documents containing particular keywords, while keeping the keywords themselves secret and without jeopardizing the security of the remotely stored documents. Solutions to this problem under precise security requirements are provided in [10].
The methods are efficient because no public-key cryptosystem is involved. Indeed, the approach is independent of the encryption technique chosen for the remote documents. The methods are also incremental: user U can submit new documents that are secure against earlier queries but still searchable by future queries [13, 17].

C. Cryptographic Cloud Storage
While the advantages of using a public cloud infrastructure are clear, it introduces considerable security and privacy threats. In practice, the major obstacle to the adoption of cloud storage appears to be concern over the confidentiality and integrity of information. An overview of the benefits of a cryptographic storage service, such as reducing the legal exposure of both customers and cloud providers and achieving regulatory compliance, is given in [11].

D. Competent and Safe Multi-Keyword Search on Encrypted Cloud Data
On the one hand, users, who do not necessarily have prior knowledge of the encrypted cloud data, have to post-process each retrieved document to find the ones that best match their interest. On the other hand, always retrieving every document containing the queried keyword incurs unnecessary network traffic, which is highly undesirable in today’s pay-as-you-use cloud model. The cited work defines and solves the problem of efficient yet secure ranked keyword search over encrypted cloud data [10-14].

METHODOLOGY
System Model
The system has two main categories of users: the admin and the data owners. The admin initially builds a data dictionary of possible keywords and their synonyms. A user logs in to the system to use its services. When the user selects a file to upload, the file is encrypted using the mobile device’s serial ID. The server requests the file together with its associated keywords, which the user chooses from a set of pre-defined keywords. Once the file reaches the server, the term frequencies of each keyword and its synonyms are calculated over the file and stored in the database (DB) as the keyword rank. At search time, the keywords are analyzed and file links are listed according to their relevancy rank.
Notations

Table 1: Notations

Data Dictionary

The admin adds keywords and associates multiple synonyms with each of them.
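A minimal sketch of such a data dictionary, assuming a simple in-memory mapping. The function names `add_keyword` and `expand` and the storage layout are illustrative assumptions and are not taken from the described system; the sample entries are the keywords and synonyms listed in Table 2.

```python
from collections import defaultdict

# Hypothetical in-memory data dictionary: keyword -> set of synonyms.
data_dictionary = defaultdict(set)

def add_keyword(keyword, synonyms):
    """Register a keyword and associate one or more synonyms with it."""
    key = keyword.lower()
    data_dictionary[key].update(s.lower() for s in synonyms)

def expand(keyword):
    """Return the keyword followed by its synonyms (sorted for stable output)."""
    key = keyword.lower()
    return [key] + sorted(data_dictionary.get(key, ()))

# Entries taken from Table 2.
add_keyword("Nature", ["environment", "ecosystem"])
add_keyword("Creature", ["animals", "livingbeings"])
add_keyword("Mobile", ["cell phone", "phone"])
```

Calling `expand("Nature")` returns the keyword together with both of its synonyms, which is the set of terms the later ranking step would have to look for.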

File Outsourcing

The user logs in and outsources files along with the relevant keywords chosen from the set of keywords. The file is encrypted using the mobile device attribute key.
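The text does not state how the device attribute is turned into a cipher key. The sketch below shows one possible approach, assuming PBKDF2-HMAC-SHA256 from the Python standard library; the function name `derive_file_key`, the salt handling, the iteration count, and the example serial ID are all assumptions made for illustration. Only the key derivation is shown; the AES step itself needs a third-party library and is omitted.

```python
import hashlib
import os

def derive_file_key(device_serial_id: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte (AES-256-sized) key from a device attribute.

    Assumption: PBKDF2-HMAC-SHA256 is chosen here only for illustration;
    the original system may derive its key differently.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", device_serial_id.encode("utf-8"), salt, iterations, dklen=32
    )

# A random salt would be stored next to the encrypted file; it is not secret.
salt = os.urandom(16)
key = derive_file_key("EXAMPLE-SERIAL-0001", salt)  # hypothetical serial ID
```

The same serial ID and salt always yield the same key, so the device can re-derive it at download time without the key ever being stored on the server.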

File Keyword Ranking

The server counts the occurrences of each keyword and its synonyms in the file and aggregates both frequencies to obtain the term frequency for each keyword. The file is then processed again for encryption using the attribute provided by the user’s mobile device.
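The aggregation described above can be sketched as follows, assuming whole-word, case-insensitive matching on plaintext. The helper names `term_frequency` and `keyword_rank` and the sample sentence are illustrative assumptions.

```python
import re

def term_frequency(text, term):
    """Count whole-word, case-insensitive occurrences of a term (may be multi-word)."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))

def keyword_rank(text, keyword, synonyms):
    """Aggregate the frequency of a keyword and of each of its synonyms."""
    return term_frequency(text, keyword) + sum(term_frequency(text, s) for s in synonyms)

sample = "The environment shapes nature. A healthy ecosystem protects nature."
rank = keyword_rank(sample, "nature", ["environment", "ecosystem"])
```

Here `rank` is 4: two occurrences of the keyword plus one of each synonym. Note that overlapping synonyms such as "cell phone" and "phone" would each be counted by this sketch, so one occurrence of "cell phone" contributes 2.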

File Search

The user searches for a set of keywords within his set of files. The rank of each keyword is found, and the sum of these ranks is divided by the total number of keywords to give the average rank of the file for that keyword set. Encrypted file names are shown, and the user downloads a file after it is decrypted using his attribute.
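A minimal sketch of this averaging and ordering step, assuming the per-file keyword ranks were stored at upload time. The index layout, the file names, and the choice to omit files whose average rank is zero are assumptions made for illustration.

```python
def average_rank(keyword_ranks, query):
    """Average the stored ranks of the query keywords for one file.

    keyword_ranks maps keyword -> stored rank (aggregated term frequency).
    Keywords absent from the file contribute 0.
    """
    if not query:
        return 0.0
    return sum(keyword_ranks.get(k, 0) for k in query) / len(query)

def rank_files(index, query):
    """Return (file_id, average_rank) pairs, most relevant first.

    Assumption: files with an average rank of zero are left out of the listing.
    """
    scored = [(fid, average_rank(ranks, query)) for fid, ranks in index.items()]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)

# Hypothetical index: encrypted file name -> keyword ranks stored at upload time.
index = {
    "file_a.enc": {"nature": 4, "creature": 2},
    "file_b.enc": {"nature": 1},
    "file_c.enc": {"mobile": 5},
}
results = rank_files(index, ["nature", "creature"])
```

For the query above, `file_a.enc` scores (4 + 2) / 2 = 3.0 and `file_b.enc` scores (1 + 0) / 2 = 0.5, so they are listed in that order. No file has to be decrypted at search time, because only the stored ranks are consulted.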

EXPERIMENTAL RESULTS
Experiments were carried out on the proposed system model with different data sets and different file sizes. Any plaintext file is valid for upload and further operations.
Table 2: File Size and Keywords Synonyms

Keyword     Synonyms
Nature      environment, ecosystem
Creature    animals, livingbeings
Mobile      cell phone, phone

The graph below shows the file upload time and the keyword analysis time on the server. The chart depicts the time taken to search for keyword and synonym frequencies in the file as well as the upload processing time. As the file size varies across 50 KB, 100 KB, and 500 KB, the time required to search for terms in the file also varies. We experimented with different numbers of search keywords, each having 2 synonyms, and found that the search time is marginally less than the file upload processing time.

Figure 2: Data Upload Time Graph


Figure 3: Upload Time Chart

Figure 3 shows a bar chart of the upload time and the TF search time for keywords and synonyms in a file sent for upload. The time varies slightly with the number of keywords and synonyms that must be found in the file to determine the ranking. For example, processing a 50 KB file with 3 keywords (2 synonyms each) takes 467 ms, which is proportionally higher per kilobyte than the 2881 ms taken to process a 500 KB file with 3 keywords.
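The comparison between the two file sizes can be checked by normalizing each reported time by the file size. The sketch below uses only the two measurements quoted in the text; the variable names are illustrative.

```python
# Upload-processing times reported in the text for 3 keywords with 2 synonyms each.
measurements_ms = {50: 467, 500: 2881}  # file size in KB -> time in ms

# Cost per kilobyte for each file size.
per_kb = {size: ms / size for size, ms in measurements_ms.items()}

for size, cost in per_kb.items():
    print(f"{size} KB: {cost:.2f} ms/KB")
```

This gives about 9.34 ms/KB for the 50 KB file and about 5.76 ms/KB for the 500 KB file, so the smaller file costs more per kilobyte even though its total time is lower.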


Figure 4: Encryption Time

AES is used for encryption so that a combination of high security and efficiency can be maintained. AES takes 6 ms to encrypt a 50 KB file, 8 ms for a 100 KB file, and only 11 ms for a 500 KB file. It can therefore be said that initial processing accounts for most of this small cost, and that encryption time thereafter grows negligibly with file size.


Figure 5: Search Result Time

Figure 5 shows the search result time for queries with 1, 2, and 3 keywords (2 synonyms each). Searching took 32 ms with 1 keyword, 73 ms with 2 keywords, and 166 ms with 3 keywords. The measured time roughly doubles with each added keyword, although the number of terms searched grows only linearly: 1 keyword implies 1 + 1 × 2 = 3 term searches, and 3 keywords imply 3 + 3 × 2 = 9 term searches.
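The term counts behind these timings can be tabulated with a short sketch. The function name `terms_searched` is an illustrative assumption; the timings are the ones reported in the text.

```python
def terms_searched(num_keywords, synonyms_per_keyword=2):
    """Each query keyword is searched together with all of its synonyms."""
    return num_keywords + num_keywords * synonyms_per_keyword

reported_ms = {1: 32, 2: 73, 3: 166}  # search times reported in the text

# Number of terms looked up for each query size.
term_counts = {k: terms_searched(k) for k in reported_ms}

# Ratio of each reported time to the previous one.
growth = [reported_ms[k + 1] / reported_ms[k] for k in (1, 2)]
```

The term counts are 3, 6, and 9 for 1, 2, and 3 keywords, while each reported time is about 2.3 times the previous one.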
Figure 6: Decryption & Download Link Generation Time
Figure 6 shows the time required to decrypt the file and generate links for downloading the raw file. Once the file is downloaded, it is deleted automatically.

CONCLUSION
This paper proposes a novel search mechanism that can be carried out on encrypted mobile data over cloud storage. The design addresses the two aspects that are essential for the smooth functioning of the system. First, it assures the confidentiality of the data stored on a semi-trusted server; second, it maintains the efficiency of the entire system by using indexing and a novel search system. The search mechanism is not limited to keyword queries but also takes the synonyms of the keywords into account when generating a ranking. For cipher operations, the AES algorithm is used, which is a good choice when balancing efficiency with security.
Future work can address Natural Language Processing based search. The proposed architecture can also be extended so that the user need not specify keywords; instead, the system would itself generate the related keywords and rank the file on those keywords and their synonyms.

REFERENCES
[1] P. Mell and T. Grance, “The NIST Definition of Cloud Computing, Version 15,” Nat’l Inst. of Standards and Technology, Information Technology Laboratory, vol. 53, p. 50, http://csrc.nist.gov/groups/SNS/cloud-computing/, 2010.
[2] Lori M. Kaufman, Data security in the world of cloud computing, IEEE Security and Privacy 7 (2009), 61-64.
[3] Sean Carlin Kevin Curran and Mervyn Adams, Security issues in cloud computing, Elixir 38 (2011), 4069-4072.
[4] Hassan Takabi, James B. D. Joshi, and Gail-Joon Ahn, Security and privacy challenges in cloud computing environments, IEEE Security and Privacy 8 (2010), 24-31.
[5] C. Wang, N. Cao, K. Ren, and W. Lou, “Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data”, Proc. IEEE, Parallel and Distributed Systems, Aug. 2012.
[6] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, “Secure Ranked Keyword Search over Encrypted Cloud Data,” Proc. IEEE 30th Int’l Conf. Distributed Computing Systems (ICDCS ’10), 2010.
[7] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. Above the clouds: A Berkeley view of cloud computing, Feb 2009.
[8] J. Baek, R. Safavi-Naini, and W. Susilo. Public key encryption with keyword search revisited. In Proceedings of ICCSA, Part I, ICCSA ’08, pages 1249–1259, 2008.
[9] L. Ballard, S. Kamara, and F. Monrose. Achieving efficient conjunctive keyword searches over encrypted data. In Proc. of ICICS’05, 2005.
[10] F. Bao, R. H. Deng, X. Ding, and Y. Yang. Private query on encrypted data in multi-user settings. In ISPEC’08, pages 71–85, Berlin, Heidelberg, 2008. Springer-Verlag.
[11] E. Shi, J. Bethencourt, T.-H. H. Chan, D. Song, and A. Perrig. Multi-dimensional range query over encrypted data. In IEEE Symposium on Security and Privacy, SP ’07, pages 350–364, 2007.
[12] E. Shi and B. Waters. Delegating capabilities in predicate encryption systems. In ICALP ’08, pages 560–578, 2008.
[13] D. Song, D. Wagner, and A. Perrig. Practical techniques for searches on encrypted data. In Proc. of IEEE S & P ’00, 2000.
[14] Z. Yang, S. Zhong, and R. N. Wright. Privacy-preserving queries on encrypted data. In Proc. of ESORICS’06, pages 479–495, 2006.
[15] S. Yu, C. Wang, K. Ren, and W. Lou. Achieving secure, scalable, and fine-grained data access control in cloud computing. In IEEE INFOCOM’10, 2010.
[16] B. Zhu, B. Zhu, and K. Ren. Peksrand: Providing predicate privacy in public-key encryption with keyword search. Cryptology ePrint Archive, Report 2010/466, 2010.
[17] M. Li, S. Yu, N. Cao, and W. Lou. Authorized private keyword search over encrypted data in cloud computing. Technical report, http://ece.wpi.edu/ mingli/, Mar. 2011.
[18] M. Li, S. Yu, K. Ren, and W. Lou. Securing personal health records in cloud computing: Patient-centric and fine-grained data access control in multi-owner settings. In SecureComm’10, pages 89–106, Sept. 2010.