这是一篇来自澳洲的关于网络安全的新兴课题的代码代写
1 Overview
The objective of this assignment is to assess the learning outcomes of searching over encrypted data. Specifically, the tasks will evaluate your knowledge of security analysis of searchable symmetric encryption (SSE), count attack against SSE, countermeasures for SSE schemes, and inference attacks against encrypted databases built from property-preserving encryption.
2 Submission Policy
You need to submit a report (one single PDF file) to describe what you have done and what you have observed with screenshots whenever necessary; you also need to provide explanations or codes to the observations that are related to the tasks. In your report, you are expected to answer all the questions listed in this manual. Typeset your report into .pdf format (make sure it can be opened with Adobe Reader) and name it as the format:
[Your Name]-[Student ID]-FIT5124-Assignment1,HarryPotter-12345678-FIT5124-Assignment1.pdf.
All source code if required should be embedded in your report. Please upload the PDF file to Moodle. Note that the assignment is due on 1 April, Saturday, 11:55 pm (No extension).
Late submission penalty: 10-point deduction per day. Your assignment will not be marked after a delay of 7 days. If you require special consideration, the application should be submitted at least three days in advance via Monash Connect (https://www.monash.edu/connect). Zero tolerance on plagiarism: If you are found cheating, penalties will be applied, i.e., a zero grade for the unit. The demonstration video is also used to detect/avoid plagiarism.
University policies can be found at
https://www.monash.edu/students/academic/policies/academic-integrity
Generative AI tools are not restricted for this assessment task: In this assessment, you can use generative artificial intelligence (AI) to assist you in any way. Any use of generative AI must be appropriately acknowledged. How to cite the use of generative AI can be found here:
https://www.monash.edu/learning-teaching/teachhq/Teaching-practices/artificial-intelligence/policy-and-practice-guidance-around-acceptable-and-responsible-use-of-ai-te chnologies. Note that there will be one subtask that requires you to use ChatGPT to complete (Task 3.2).
3 Security Analysis of Searchable Symmetric Encryption (30 Marks)
Perform the formal security analysis for the scheme in the paper of “Structured Encryption and Controlled Disclosure” https://eprint.iacr.org/2011/010.pdf (see the screenshot below).
More detailed construction and notations can be found in the paper.
3.1 Security Definition (10 Marks)
Please give the simulation-based security definition for the above scheme regarding non-adaptive security.
3.2 Security Analysis (20 Marks)
Please perform the formal security analysis based on the security definition and the construction. Hint: leakage function definition (5 marks), steps on how to simulate the security game (5 marks), explanation on how the simulated view is indistinguishable from the real view (5 marks).
Interaction with ChatGPT (5 marks): please kindly note that you will interact with ChatGPT to answer this question. Please list your prompts and the responses from ChatGPT, perform correctness analysis on the ChatGPT response, and show how you fix the errors or inaccurate statements if any.
4 Count Attacks against Searchable Symmetric Encryption (40 marks)
4.1 Case study I: given a document set {d1, d2, d3, d4, d5, d6}, and d1 = {w1, w2,w3, w4, w6}, d2 = {w1, w2, w3, w6}, d3 = {w1, w2, w4, w6}, d4 = {w1, w4, w6}, d5 ={w4, w5, w6}, d6 = {w4, w6}, and assume that the adversary knows the above information about the document set. Then the adversary starts to launch count attacks for the document set encrypted by an SSE scheme. He observed the following query response from the server.
Note that e1,…, e6 are the permuted document IDs and the adversary does not know the mapping between them and the original ones.
Q: Please explain how to recover the underlying keyword of each query step by step,and write down the mapping between queries and keywords (10 marks).
4.2 Case study II: given a document set {d1, d2, d3, d4, d5, d6}, and d1 = {w1, w2,w3, w4, w6}, d2 = {w2, w3, w6}, d3 = {w1, w3, w4, w6}, d4 = {w1, w4, w6}, d5 = {w4,w5, w6}, d6 = {w5}, and assume that the adversary knows the above information about the document set. Then the adversary starts to launch count attacks for the document set encrypted by an SSE scheme. He observed the following query response from the server.
Q: Please explain how to recover the underlying keyword of each query step by step,and write down the mapping between queries and keywords. Can the count attack recover all queries? If not, please explain the reason. (15 marks)
4.3 Case study III: given a document set {d1, d2, d3, d4, d5, d6}, and d1 = {w1, w2,w3, w6}, d2 = {w2, w3, w4, w6}, d3 = {w1, w2, w3, w4, w6}, d4 = {w1, w3, w4, w6},d5 = {w5, w6}, d6 = {w5, w6}, and assume that the adversary knows the above information about the document set. Then the adversary starts to launch count attacks for the document set encrypted by an SSE scheme. He observed the following query response from the server.
Q: Please explain how to recover the underlying keyword of each query step by step,and write down the mapping between queries and keywords. Can the count attack recover all queries? If not, please explain the reason. (15 marks)
5 Padding Countermeasures in Searchable Symmetric Encryption (20 Marks)
The primary countermeasure for the above count attack is padding. In this task, you are given the attached source code and data:
- Inverted_index_5000: This binary file contains the inverted index of the top 5000 most frequent keywords in the real Enron email dataset (https://www.cs.cmu.edu/~enron/).
- frequency_5000.csv: This .csv file details the (keyword, frequency) in the above index. This .csv file is already sorted in descending order.
- app.py and utilities.py. The current app.py script reads the inverted index into a Python dictionary data structure, namely inverted_dict. Then, using inverted_dict[keyword] will give you the list of document identifiers mapping to that keyword. You can review these two scripts for more details.
The first padding approach (Approach #1) is to add more padding (w,id’) pairs for each keyword w such that all the keywords have the same number of real and padding pairs, where id’ is a padding document identifier. In this way, all keywords will have the result length which is the same as the result length of the most frequent keyword when the server executes the search operation in the SSE scheme.
The second padding solution (Approach #2) is to add more padding (w,id’) pairs to w such that the query’s result length of w is a multiplication of an integer. For example,the current frequency of w is 6, then it needs 4 padding (w,id’) pairs if the multiplication of 5 is used.
The above two approaches can be found in [4].