Python | Project: Extract Information From EDGAR Platform

This is a Python assignment from Australia that mainly involves data extraction.

1. General instructions

1.1. Define the year and quarter, then fetch the EDGAR index at

https://www.sec.gov/Archives/edgar/full-index/%s/QTR%s/company.idx
where the first %s is the year and the second %s (after QTR) is the quarter number.

e.g. all files filed in 2022 QTR1 are listed in
https://www.sec.gov/Archives/edgar/full-index/2022/QTR1/company.idx

The index looks like Figure 1.

Figure 1.
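
The URL pattern in 1.1 can be sketched as below. This is a minimal sketch, not a prescribed implementation: the names `INDEX_URL_TEMPLATE` and `build_index_url` are illustrative and do not come from the assignment, and downloading the file is left out.

```python
# Template copied from step 1.1; the two %s slots are year and quarter.
INDEX_URL_TEMPLATE = (
    "https://www.sec.gov/Archives/edgar/full-index/%s/QTR%s/company.idx"
)


def build_index_url(year, quarter):
    """Return the company.idx URL for one year-quarter (hypothetical helper)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {quarter!r}")
    return INDEX_URL_TEMPLATE % (year, quarter)


print(build_index_url(2022, 1))
# prints https://www.sec.gov/Archives/edgar/full-index/2022/QTR1/company.idx
```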

1.2. Filter rows by “Form Type” (defined later in Sections 2 – 4) and access each text file by concatenating
strings to build the URL: “https://www.sec.gov/Archives/” + File Name.
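
One possible way to carry out step 1.2 is sketched below. It assumes that company.idx is a fixed-width text file whose data rows line up with the header labels, so the column offsets are read off the header line rather than hard-coded. The helper names (`parse_company_idx`, `select_by_form_type`, `filing_url`) are hypothetical, and the sample rows are made up for illustration.

```python
COLUMNS = ["Company Name", "Form Type", "CIK", "Date Filed", "File Name"]
ARCHIVES_BASE = "https://www.sec.gov/Archives/"


def parse_company_idx(text):
    """Parse company.idx text into a list of dicts keyed by column name.

    Assumption: data rows are fixed-width and aligned with the header
    labels, so each column starts where its label starts.
    """
    lines = text.splitlines()
    header_no = next(i for i, line in enumerate(lines)
                     if line.startswith("Company Name"))
    header = lines[header_no]
    starts = [header.index(name) for name in COLUMNS]
    ends = starts[1:] + [None]  # the last column runs to the end of the line
    rows = []
    for line in lines[header_no + 1:]:
        stripped = line.strip()
        if not stripped or set(stripped) == {"-"}:
            continue  # skip blank lines and the dashed separator
        rows.append({name: line[s:e].strip()
                     for name, s, e in zip(COLUMNS, starts, ends)})
    return rows


def select_by_form_type(rows, form_types):
    """Keep rows whose Form Type exactly matches one of form_types."""
    return [row for row in rows if row["Form Type"] in form_types]


def filing_url(row):
    """Concatenate the Archives base URL with the row's File Name."""
    return ARCHIVES_BASE + row["File Name"]


# Made-up sample rows for illustration only (not real filings).
fmt = "{:<62}{:<12}{:<12}{:<12}{}"
sample = "\n".join([
    fmt.format(*COLUMNS),
    "-" * 140,
    fmt.format("EXAMPLE FUND TRUST", "N-1A", "1234567", "2022-01-03",
               "edgar/data/1234567/0001234567-22-000001.txt"),
    fmt.format("EXAMPLE OPERATING CORP", "10-K", "7654321", "2022-01-04",
               "edgar/data/7654321/0007654321-22-000002.txt"),
])
wanted = {"485APOS", "N-1A", "N-1A/A"}
for row in select_by_form_type(parse_company_idx(sample), wanted):
    print(filing_url(row))
```

Exact matching on the stripped “Form Type” value keeps “N-1A” and “N-1A/A” distinct, which a substring test would not.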

1.3. Extract information from each filtered text file and write the results to CSV files, one set per year-quarter.

2. Task A – Specific instructions

▪ Select rows where “Form Type” = “485APOS” or “N-1A” or “N-1A/A”. Although these three form
types have different names in company.idx, their filings follow a very similar template.

▪ Write selected rows into csv: “A” + “_” + year + “_” + quarter.csv, e.g. A_2011_3.csv,
A_2011_4.csv. Column names are the same as company.idx.

▪ Write unselected rows into csv: “A” + “_” + year + “_” + quarter + “_” + “rej”.csv, e.g.
A_2011_3_rej.csv, A_2011_4_rej.csv. Column names are the same as company.idx.

▪ Read each selected text file and extract information for the columns defined in Table 1.

▪ Write the extracted results into csv: “A” + “_” + year + “_” + quarter + “_” + “result”.csv, e.g.
A_2011_3_result.csv, A_2011_4_result.csv.

▪ Enter “N99999A” for missing observations.
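
The file naming and the missing-value rule above can be sketched as follows. This is only an illustration under stated assumptions: the helpers `output_paths`, `split_rows` and `write_rows` are hypothetical, rows are assumed to be dicts keyed by the company.idx column names, and the result columns must come from Table 1, which is not reproduced here. Passing “B” as the task prefix gives the Task B file names.

```python
import csv
from pathlib import Path

IDX_COLUMNS = ["Company Name", "Form Type", "CIK", "Date Filed", "File Name"]
FORM_TYPES = {"485APOS", "N-1A", "N-1A/A"}
MISSING = "N99999A"  # sentinel required for missing observations


def output_paths(task, year, quarter, out_dir="."):
    """Build the three output paths, e.g. A_2011_3.csv, A_2011_3_rej.csv."""
    stem = f"{task}_{year}_{quarter}"
    base = Path(out_dir)
    return {
        "selected": base / f"{stem}.csv",
        "rejected": base / f"{stem}_rej.csv",
        "result": base / f"{stem}_result.csv",
    }


def split_rows(rows, form_types=FORM_TYPES):
    """Split index rows into (selected, rejected) by exact Form Type match."""
    selected = [r for r in rows if r["Form Type"] in form_types]
    rejected = [r for r in rows if r["Form Type"] not in form_types]
    return selected, rejected


def write_rows(path, columns, rows):
    """Write rows to CSV; absent, None or empty values become MISSING."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=columns)
        writer.writeheader()
        for row in rows:
            writer.writerow({
                col: row[col] if row.get(col) not in (None, "") else MISSING
                for col in columns
            })


paths = output_paths("A", 2011, 3)
print(paths["selected"].name, paths["rejected"].name, paths["result"].name)
# prints A_2011_3.csv A_2011_3_rej.csv A_2011_3_result.csv
```

A typical call sequence would be `selected, rejected = split_rows(rows)`, then `write_rows(paths["selected"], IDX_COLUMNS, selected)` and `write_rows(paths["rejected"], IDX_COLUMNS, rejected)`, with the result file written once the Table 1 fields have been extracted.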

3. Task B – Specific instructions

▪ Select rows where “Form Type” = “485APOS” or “N-1A” or “N-1A/A”. Although these three form
types have different names in company.idx, their filings follow a very similar template.

▪ Write selected rows into csv: “B” + “_” + year + “_” + quarter.csv, e.g. B_2011_3.csv,
B_2011_4.csv. Column names are the same as company.idx.

▪ Write unselected rows into csv: “B” + “_” + year + “_” + quarter + “_” + “rej”.csv, e.g.
B_2011_3_rej.csv, B_2011_4_rej.csv. Column names are the same as company.idx.

▪ Read each selected text file and extract information for the columns defined in Table 2.

▪ Write the extracted results into csv: “B” + “_” + year + “_” + quarter + “_” + “result”.csv, e.g.
B_2011_3_result.csv, B_2011_4_result.csv.

▪ Enter “N99999A” for missing observations.

Note that, when spelling names, EDGAR filings may use special characters and their character codes interchangeably.

For example, “Global X Farmland & Timberland ETF” and “Global X Farmland &amp; Timberland ETF”
are both used in the same file.
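
One way to handle this is to decode character codes before comparing or storing names. The sketch below assumes the “codes” are HTML character references such as `&amp;` (an assumption, since the assignment does not name the encoding) and uses the standard library's `html.unescape`; `normalize_name` is a hypothetical helper.

```python
import html


def normalize_name(name):
    """Decode HTML character references and collapse runs of whitespace."""
    return " ".join(html.unescape(name).split())


# Assumed example: the same fund name written with "&" and with "&amp;".
plain = "Global X Farmland & Timberland ETF"
coded = "Global X Farmland &amp; Timberland ETF"
print(normalize_name(plain) == normalize_name(coded))  # prints True
```

Normalizing at extraction time means both spellings end up as a single value in the result csv.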