Write My Paper Button

WhatsApp Widget

This Assessment 2 aims to assess your ability to apply statistical techniques used in data analytics to “real-life” financial data. You are expected to utilise the knowledge gained through your study material (week 7

Assignment Task

The purpose of this assessment

This Assessment 2 aims to assess your ability to apply statistical techniques used in data analytics to “real-life” financial data. You are expected to utilise the knowledge gained through your study material (week 7 to week 10 in 110.304). You are also encouraged to draw on your knowledge gained from other courses and any other information you have acquired from your own research and/or work experience to prepare an analysis and communicate it to various stakeholders.

Objectives

The assignment will also cover the course’s key generic competencies:

  • Develop a data analytics
  • Experience sourcing real raw data and applying extraction, transformation, and loading (ETL) techniques to prepare the data for
  • Improve your database problem-solving
  • Encounter real data problems and employ statistical tools and techniques learned in this class (and beyond) to solve them:
    • Descriptive analytics approach, distributions and sampling approach and diagnostic ap- proach for initial analysis;
    • Hypothesis testing for differences in groups, confidence interval estimations, analysis of variance and tests of statistical significance;
    • Regression analysis for hypothesis testing, including chi-square and non-parametric tests
  • Interpret the results and communicate to

Assessment 2 consists of a word report (excluding the reference list and Excel spreadsheet) and workings on an Excel worksheet.

Further details on presentation and marking guidelines are included in the Assessment 2 details below.

Introduction

This individual report explores financial data to discover insights of interest to various stakeholders. A divergence of interests between owners, debt-holders and corporate managers arises due to agency problems. Evidence suggests these managers have incentives to manage earnings for their own benefit, particularly when facing uncertainty, capital market and career-related pressures. Managers can manage earnings in two ways- (1) Discretionary accruals ( DACC ) and (2) Real earnings management ( REM1 & REM2 ).

1. Discretionary accruals ( DACC ) are accounting adjustments made by management to manipulate/manage financial They are considered discretionary because they are not necessary for the company’s day-to-day operations but are used to meet certain financial objectives or manage the appearance of the company’s financial performance.

Discretionary accruals can include adjustments to estimated expenses, such as allowances for bad debts or inventory write-downs, as well as adjustments to revenue recognition practices. These adjustments can be made to either inflate or deflate reported earnings, depending on the reporting entity’s goals.

While discretionary accruals can be legal, they can also be used to engage in financial fraud or to mislead investors. As a result, regulatory agencies such as the Australian Securities & Investments Commission (ASIC) closely monitor the use of discretionary accruals and require companies to disclose the extent to which they have been used in financial reporting.

2. Real earnings management ( REM ) uses operational or business activities to manipulate a company’s financial results. This differs from accrual-based earnings management, which involves manipulating accounting entries to manipulate financial

Real earnings management typically involves actions taken by management to either increase or decrease reported earnings through operational decisions. For example, a company may defer main- tenance on equipment to reduce expenses in the current period or accelerate sales to increase revenue in the current period. These actions are often taken in response to financial pressures, such as meeting earnings targets or increasing share prices.

While real earnings management is not necessarily illegal, it can be used to mislead investors and can be a sign of poor corporate governance. It is important for investors to analyse a company’s financial statements carefully and to look for signs of real earnings management, such as unusual patterns in revenue or expenses that do not align with the company’s underlying business operations. For this report, you will use two types of measures for real earnings management: (1) REM1 and (2) REM2

In Australia, the regulating body of the financial market, the Australian Securities & Investments Commis- sion (ASIC), monitors the possible managerial strategic earnings management. During the global pandemic (COVID-19), firms’ operations were significantly disrupted, and ASIC is concerned that firms may manage earnings to improve their financial position for the years 2020 and 2021. Where this is occurring, ASIC re- quires evidence, including the comparison of pre and post-pandemic financial position/ performance.

To identify pre-pandemic earnings management, ASIC hires you to apply data analytics techniques to monitor the major companies listed in the Australian Securities Exchange (ASX). In this report, you will explore the factors that may indicate firms’ earnings management using statistical techniques covered in this data analytics course. Your findings will provide insights to ASIC, investors and other stakeholders concerning the ongoing debate about firms’ earnings management.

Your analysis is expected to cover S&P/ASX200 listed companies from 2010 to 2019. This analysis will be carried out in two parts. In the first part, you will conduct ETL (Extraction, transformation and loading) along with basic analysis tasks using Excel. The second part involves the application of more advanced statistical analysis using Rstudio to generate results to answer various identified earnings-related issues/problems. The tasks below are presented in the sequence in which you are expected to perform them.

Required tasks

A. Perform ETL and initial analysis

1. Download three files containing 200 companies’ data in XLSX format and an Excel work file labelled ‘Working Excel File Assessment 2’ from the Stream site. You are required to complete all the steps for task 1 in the Working Excel File.

2. First, open the Working Excel file to combine all three company files into a single dataset using Power Query. While combining d ata in Power Query, you must also carry out the data transformation process using Power Query. The steps in Power Query that involve cleaning and transforming the data are as follows:

  • Combining/merging files for
  • Checking datasets to be “sufficiently different” in format/structure from one
  • Paying close attention to the state of the data and cleaning as necessary to improve the data quality and subsequent Common data cleaning procedures may involve:
    • Formatting the date to match the acceptable format for your Also, correct/exclude when encountering special characters (*, #, “, ’) or missing data. (e.g., apply the filter in Power Query)
    • Watching for invalid number formats when starting sorting. For example, accounting symbols such as dollar signs, commas, and parentheses are pervasive in CSV spread- sheet data (e.g., $12,345.22 or (1,422.53)).
    • Splitting column if it contains two different pieces of information in one. (e.g., Split column - By Delimiter)
    • Generating, excluding, renaming and labelling variables to prepare the data for anal-
    • Remove duplicate items based on suitable criteria (e.g., Company ID, year). (f) The last observation should be 1,999.

3. Once you are happy with the transformation process, save and load the data into the Working Excel file and copy the data to the worksheet named Final Data (and delete the old one!). Now, you will conduct some basic analysis using the final data as per the following instructions.

  • Using a pivot table, calculate the DACC , REM1 and REM2 by industry, year and company. (make sure you take the average value rather than the total)
  • Create a bar chart by industry and a scatter plot by the company for DACC , REM1 and REM2 .

4. Once you have finished with the initial analysis in Excel, then save the file and transfer graphs and initial results to the individual report’s first part as per the template and explain the following aspects of the initial findings:

  • Do the Bar chart and scatter plots initially reveal any potential earnings management*? By comparing scatter plots between DACC , REM1 and REM2 , can you identify one company that is significantly involved with downward earnings management?
  • Identify skewness of DACC , REM1 and REM2 .

B. Performing the test using statistical tools and analysing the results

Data Analytics approaches rely on a series of statistical techniques that are used to understand data and gain insights into the underlying causes and effects of business decisions. You will use Rstudio to perform the following statistical tests to generate results that you will use later for recommendations.

1. Create Table 2.1 and generate summary descriptive statistics for DACC, REM1, REM2, VISIBLE, SIZE, LEVERAGE, MTB, LOSS, CF, ZScore and BIG4 . Also, generate Table 2.2 and present summary statistics of mean value for all variables by industry using the GICS sector name.

2. Create Table 2.3 , present the estimated Pearson correlation matrix for all variables, and identify any possible multicollinearity.

3. Create a dummy variable to represent highly visible companies based on the level of news coverage. Use the average of the news coverage (e.g., 157.62) and label the dummy variable as VISIBLE = 1 if the company’s news coverage for that year is greater than average and VISIBLE = 0 otherwise.

  • Create Table 2.4 and present a subsample based on highly visible and less visible firms and perform summary statistics of all the variables in Table 1 for both

4. Estimate the simple regression model and present it in Table 2.5 ,

EM = β + β V ISIBLE + ϵ (1)

Where EM is either DACC , REM1 or REM2 . VISIBLE is a dummy variable that equals 1 if the firm’s yearly news coverage falls above the sample period’s average news coverage and 0 otherwise.

  • Explain the model’s goodness-of-fit. Test the statistical significance of the slope and intercept estimated at the 5% significance level. Write down the underlying hypothesis for each test, along with your (e.g., the expected association between EM and VISIBLE )
  • Draw a scatter plot of residuals against predicted values and
  • Compute the Durbin-Watson statistic and explain its