Category: Data science

  • Title: Comparing Means in Excel: Understanding the Significance of Differences Between Years

    You have been working in Excel to find descriptive statistics; now let’s explore how to compare means. You will need the Excel spreadsheet with your descriptives from the attached assignment. When comparing means, we want to know whether they are significantly different. Your spreadsheet contains three groups of data, one each for 2017, 2018, and 2019, but simple comparison-of-means procedures typically cover only two means at a time. Running multiple tests is not ideal, but for our purposes, let’s examine the independent t-test. Simply put, in this case we have the following:
    Following the video instructions, perform an independent t-test on your data sets for 2017–2018 and another on your data sets for 2018–2019. Be sure to review the p-value of each test.
    Are there significant differences between these years? (Note: run two separate t-tests.)
    Be sure to label your output statistics on your worksheet so your instructor can identify the years you are comparing. 
    Imagine you are speaking to the same individual that you explained the descriptive statistics to in M1.3, again, assuming they cannot see the data and know nothing about statistics. Their question to you is, “So, there are differences between the years? How large are those differences really and what does that mean?” In a separate Word document, explain what t-tests are and how they work, and give the results of your two t-tests, explaining significance and what that means.
    Submit the completed Excel worksheet and Word document.
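
    The calculation behind an independent t-test can be sketched outside Excel as follows. This is a minimal illustration, assuming the equal-variances (pooled) form of the test; the yearly values are hypothetical placeholders rather than the assignment data, and the function names are invented for this sketch. Cohen's d is included as one common way to express how large a difference is.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the t density with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


def independent_t_test(a, b):
    """Pooled-variance two-sample t-test (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    diff = mean(a) - mean(b)
    t_stat = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    cohens_d = diff / math.sqrt(pooled_var)  # effect size in pooled SD units
    return t_stat, df, two_sided_p(t_stat, df), cohens_d


# Hypothetical values standing in for two of the yearly data sets.
values_2017 = [12.0, 15.0, 11.0, 14.0, 13.0]
values_2018 = [16.0, 18.0, 15.0, 17.0, 19.0]

t_stat, df, p_value, d = independent_t_test(values_2017, values_2018)
print(f"2017 vs 2018: t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, d = {d:.2f}")
```

    A p-value below the chosen significance level (commonly 0.05) suggests the two yearly means differ by more than chance alone would explain; the effect size indicates how large that difference is relative to the spread of the data.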

  • Title: Database Constraints and Normalization in Relational Databases

    Include the below numbering scheme in the submission. DO NOT include the Questions or other content from the instructions.
    Part 1) 
    Identify 4 database constraints (examples can include: Not NULL, primary key, unique, domain; there can be different ones as well) and data integrity, security, and availability issues related to relational databases. 
    Part 2) 
    HINT: Review the Normalization Process: Parts and Suppliers One-to-Many Example in the Terms and Concepts discussion.
    Background
    We want to keep track of the price we charge for each type of part, the supplier for each type of part, and the amount we pay the supplier for each type of part (the cost). Each supplier can provide us with many different types of parts, but each part can be provided to us by only one supplier. This requires a one-to-many relationship.
    Functional analysis:
    Part -> Price, Cost, Supplier, Street, City, State, Zip, Telephone
    Supplier -> Street, City, State, Zip, Telephone
    Part -> Price, Cost, Supplier
    Table:
    Parts (Part, Price, Cost, Supplier, Street, City, State, Zip, Telephone)
    Complete the following:
    1. Include a definition of second normal form (2NF).
    2. Is the above table in 2NF?
    3. Using the table and field names in the above table explain your answer.
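
    One way to reason about the question mechanically is to compute attribute closures from the functional dependencies listed above. The sketch below is illustrative only: it assumes the table is already in first normal form and that Part is the only candidate key, and the helper names are invented for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = attributes - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


ATTRIBUTES = {"Part", "Price", "Cost", "Supplier",
              "Street", "City", "State", "Zip", "Telephone"}
FDS = [
    ({"Part"}, {"Price", "Cost", "Supplier"}),
    ({"Supplier"}, {"Street", "City", "State", "Zip", "Telephone"}),
]
KEY = {"Part"}  # assumed to be the only candidate key

print("Key determines every attribute:", closure(KEY, FDS) == ATTRIBUTES)
print("Partial dependencies:", partial_dependencies(KEY, ATTRIBUTES, FDS))
print("Determined by Supplier alone:", sorted(closure({"Supplier"}, FDS)))
```

    A partial dependency requires a composite key, so a single-attribute key yields none here. The last line shows which fields depend on Supplier rather than directly on Part, which matters for third normal form rather than second.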
    Assignment Guidelines
    Your submission must be original and include supporting sentences that use the terms, concepts, and theories from the required readings or other material, citing the page number or website. Paraphrase the material you reference, and restrict direct quotes (copy and paste) to less than 15% of the submission (the required copy-and-paste content will not be counted toward the 15% guideline).
    **Use open source references that do not require user login or registration to access. Please check for plagiarism with Turnitin and AI detectors. The similarity score must be below 15%.**

  • Title: “Topic Analysis of Patent Data using LDA and PLSA Methodologies: A Comparative Study and Application of Coherence Score and Equation”

    I have an almost finished thesis to which I have been asked to make some additions.
    Specifically, the LDA and PLSA methodologies must be used for topic analysis of the data I am attaching:
    run the algorithms for different numbers of topics (e.g., 2–30), derive a graph of the coherence score (graph 1),
    and, for a number of topics with a good score, analyze the topics for both algorithms. I also need the most
    influential Assignees and Inventors (according to the "patent num cited by us patents" feature).
    Finally, apply the equation (which I attach) for both methodologies and compare the results.
    The rest of the work is done; no corrections or extra literature are needed. Just write code that runs
    the above (there is no need to submit the code; results along with graphics and analysis are sufficient). I will complete chapter 5.2.
    The additions I need (which I have specified in detail either in the comments on the form or in a file I attached there) are exclusively for chapters 5 and 5.5 (LDA and PLSA). I estimate that the additions will run to around 5 pages.
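
    The brief does not say which coherence measure is intended. As one possibility, the sketch below computes the UMass coherence of a single topic from document co-occurrence counts; the toy documents and top-word lists are hypothetical, and in practice the top words would come from a fitted LDA or PLSA model for each candidate number of topics.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    top_words: the topic's words, ordered from most to least probable.
    documents: an iterable of token lists.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # skip pairs whose higher-ranked word never occurs
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denom)
    return score


# Hypothetical tokenized patent abstracts.
DOCS = [
    ["battery", "cell", "charge"],
    ["battery", "cell"],
    ["engine", "fuel"],
    ["battery", "engine"],
]

for words in (["battery", "cell"], ["battery", "fuel"]):
    print(words, round(umass_coherence(words, DOCS), 4))
```

    Scores closer to zero indicate words that tend to appear in the same documents. Averaging the per-topic scores for each number of topics gives the values that a coherence graph would plot.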

  • “Optimizing Parking Space Usage through Smart Technology: A Visualization Presentation for City Council”

    As a city manager for a mid-size city, you must be able to examine patterns and trends to highlight organizational performance and support organizational strategic planning. One of the ways that is done is through analyzing the statistics. Still, just presenting the numbers is not always the most efficient way to present your analysis.
    Scenario
    You are a city manager for a mid-size city that is anticipating increases in population and auto traffic as new industries move into the downtown area. Parking spaces are already hard to find, and traffic congestion from commuters and from events such as concerts and sports can be problematic. Before considering adding additional parking, you think it is possible to use existing parking more effectively through a smart parking app that identifies parking space availability in different parking lots throughout the city in real time.
    In this assessment, you will demonstrate your skill in information visualization when you present your recommendations to the City Council members who are responsible for deciding whether the city invests in resources to set in motion the smart parking space app. 
    Preparation
    Review the parking space usage file.
    Select any 2 parking lots. For each one, review the scatter plot showing the occupancy rate at each time stamp during the week of 11/20/2022–11/26/2022. Identify whether occupancy rates are time dependent. If so, identify the times that seem to experience the highest occupancy rates.
    Research “smart cities” to provide guidance and support for your presentation. 
    Assessment Deliverable
    Create a 10- to 12-slide information visualization presentation including voice-over or screencast video. Include the following in your presentation: 
    Outline the rationale and goals of the project. 
    Analyze the box plot charts showing the occupancy rates for each day of the week and interpret the results.
    Analyze the box plot charts showing the occupancy rates for each parking lot and interpret the results.
    Choose 2 scatter plot charts showing occupancy rate against the time of day over the course of the week and interpret the results. 
    Make a recommendation about continuing with the implementation of this project.
    Format references according to APA guidelines. 
    Submit your assessment.
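
    To interpret the box plots, it can help to see the numbers each box encodes. The sketch below computes a five-number summary of occupancy rate per group (day of week or parking lot); the records and field names are hypothetical, not taken from the parking space usage file.

```python
from collections import defaultdict
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary: minimum, quartiles, and maximum."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}


def summarize_by_group(records, group_key):
    """Group occupancy rates by the given field and summarize each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record["occupancy_rate"])
    return {group: box_plot_summary(rates) for group, rates in groups.items()}


# Hypothetical observations; each group needs at least two values.
RECORDS = [
    {"day": "Mon", "lot": "A", "occupancy_rate": 0.42},
    {"day": "Mon", "lot": "A", "occupancy_rate": 0.77},
    {"day": "Mon", "lot": "B", "occupancy_rate": 0.58},
    {"day": "Sat", "lot": "A", "occupancy_rate": 0.21},
    {"day": "Sat", "lot": "B", "occupancy_rate": 0.35},
    {"day": "Sat", "lot": "B", "occupancy_rate": 0.30},
]

for day, summary in summarize_by_group(RECORDS, "day").items():
    print(day, summary)
```

    A tall box (large gap between q1 and q3) means occupancy varies widely within that day or lot, while a high median suggests the lot is consistently busy.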

  • “Analyzing Crime Rates and Police Expenditures in Southern and Northern States: A Statistical Analysis”

    Assignment instructions and data attached.
    In the first part of the project, you will determine the appropriate technique to use for each of the following questions.
    · Is there a significant difference in the mean crime rate for Southern states and the mean crime rate for Northern states? Use the original crime rates.
    · Is there a significant change in crime rates over the ten years between the first and second data collections?
    · Was there a significant correlation between crime rate and police expenditures at the time of the first collection of data? To determine this, perform a simple linear regression for these variables, including a scatterplot, r, r², an interpretation of r² in the context of the problem, a determination of whether the slope is significant, and a statement of what the slope indicates in the context of the problem.
    · Was there a significant correlation between crime rate and police expenditures ten years later? This should be answered using the same guidelines as the previous question.
    Expectations:
    • For each question, you will determine the appropriate method of analysis and apply it with SPSS. All relevant output should be included in proper APA format.
    • Your narrative should include the research question being addressed, the analysis method used to address it, the interpretation of the statistics, and references to tables and figures as appropriate.
    Material to submit
    • Your finished work must be a Word document in APA format. Please include a cover page. A reference page will not be necessary.
    • Each of the questions given above should be addressed in its own section.
    • Include the syntax from each analysis in an appendix. All the syntax can be combined into one appendix.
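
    The assignment requires SPSS, so the following is only a sketch of the quantities the regression output reports (slope, intercept, r, r², and the t statistic for the slope). The expenditure and crime-rate values are hypothetical, and the function name is invented for this example.

```python
import math
from statistics import mean


def simple_linear_regression(x, y):
    """Least-squares fit of y on x with r, r squared, and the slope's t statistic.

    Assumes at least three points and a fit that is not perfect (r squared < 1).
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    r_squared = r * r
    # The test of slope = 0 uses a t statistic with n - 2 degrees of freedom.
    t_slope = r * math.sqrt((n - 2) / (1 - r_squared))
    return {"slope": slope, "intercept": intercept, "r": r,
            "r_squared": r_squared, "t_slope": t_slope, "df": n - 2}


# Hypothetical data: police expenditure per capita and crime rate.
expenditure = [40.0, 55.0, 62.0, 75.0, 90.0, 110.0]
crime_rate = [52.0, 60.0, 58.0, 71.0, 80.0, 95.0]

for name, value in simple_linear_regression(expenditure, crime_rate).items():
    print(f"{name}: {value:.4f}")
```

    Here r² is the share of variation in crime rate accounted for by expenditure, and the slope is the estimated change in crime rate per one-unit increase in expenditure. The slope is judged significant when the p-value for its t statistic falls below the chosen significance level.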

  • “Optimizing Business Operations in 2024: A SIMUL8 Data Analysis Case Study”

    You must be proficient in using the SIMUL8 software. Use the software to solve a data analysis/business analysis case. The main body of the report (the executive summary) should not exceed 6 typed pages of A4 paper (11-point font, single spacing). You should also include a technical appendix with a maximum length of 8 pages. Further details are included in Simulation Project 2024.docx. A suitable bid will be accepted.

  • “Optimizing Business Operations through Data Analysis: A SIMUL8 Simulation Case Study”

    You must be proficient in using the SIMUL8 software. Use the software to solve a data analysis/business analysis case.
    The main body of the report (the executive summary) should not exceed 6 typed pages of A4 paper (11-point font, single spacing). You should also include a technical appendix with a maximum length of 8 pages. Further details are included in Simulation Project 2024.docx.
    A suitable bid will be accepted.

  • Functional Dependencies and Normalization in a Table

    Include the numbers on the table that is attached. DO NOT include the Questions or other content from these instructions. 
    1) Describe functional dependency ONLY; 
    – NOT full functional dependency, or partial dependency, or transitive dependency. 
    – Hint – review the Functional Dependencies Topic in the Terms and Concepts discussion. While you may incorporate the formal definition, you must explain the concept in your own words, using fields names and values from this exercise. 
    Identify the functional dependencies that exist in the table attached. All attributes should be included at least once. There may be more than one row of functional notation needed. 
    2) Identify a primary key for the table attached. 
    Indicate whether there are any alternate keys (for this table).
    Explain each of the above choices.
    3) Is the table in 3NF? 
    If not, explain why – (provide specific rationale, use field names and values in the table to demonstrate your understanding). 
    Explain what normal form the table provided is in. 
    4) Once you have determined the format for 3NF, provide the create table commands to create the tables as if you were to create the tables in SQLite or MS ACCESS.
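
    The attached table is not reproduced here, so the sketch below uses hypothetical Suppliers and Parts tables as stand-ins to show the general shape of CREATE TABLE commands for a 3NF design in SQLite. The column names, types, and constraints are assumptions for illustration, run through Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical 3NF schema: supplier details live in their own table and
# each part refers to its single supplier through a foreign key.
SCHEMA = """
CREATE TABLE Suppliers (
    Supplier   TEXT PRIMARY KEY,
    Street     TEXT NOT NULL,
    City       TEXT NOT NULL,
    State      TEXT NOT NULL,
    Zip        TEXT NOT NULL,
    Telephone  TEXT UNIQUE
);

CREATE TABLE Parts (
    Part      TEXT PRIMARY KEY,
    Price     REAL NOT NULL CHECK (Price >= 0),
    Cost      REAL NOT NULL CHECK (Cost >= 0),
    Supplier  TEXT NOT NULL REFERENCES Suppliers (Supplier)
);
"""


def create_database(path=":memory:"):
    """Create the tables and return the open connection."""
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
print([name for (name,) in tables])
```

    The same CREATE TABLE statements can be adapted for MS Access, although its data type names differ from SQLite's.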
    Follow APA guidelines and check spelling, grammar, and the file name.
    Please check against AI detectors and Turnitin for plagiarism.
    **The Turnitin report must show a similarity score of less than 15%. Please use open source references that do not require user login or registration to access.** Course Hero may not be used as a reference.

  • Title: “Improving Selection Process and Reconsidering the Use of BMD as a Detector of Osteoporotic Fractures: A Discussion of the Lai et al. Study”

    For this discussion, read the article below (Lai et al., 2019). Begin your discussion with an overall reaction to the study and then respond to the following:
    1. On p. 952, the authors state that selection bias may exist in their study. How could the selection process have been improved upon in their study to reduce its likelihood?
    2. Is there enough evidence from this study to suggest the discontinuance of using bone mineral density (BMD) as a detector of osteoporotic fractures?

  • “Revolutionizing Credit Risk Assessment and Modelling: The Role of Alternative Data Sources and Machine Learning in Improving Lending Practices in Mauritius”

    The research will address the following questions and hypotheses:
    Question 1: Can alternative data sources, such as social media activity and transaction history, improve the accuracy of credit risk assessment compared to traditional credit scoring systems?
    Question 2: Which machine learning algorithms, when applied to credit risk modelling, offer the highest predictive accuracy and efficiency?
    Question 3: How can the transparency and fairness of credit risk models be improved to provide actionable insights for lending institutions?
    Through an in-depth investigation of these questions, this research aims to provide valuable insights and solutions to the financial industry, ultimately improving credit risk modelling and lending practices in Mauritius.