{"appState":{"pageLoadApiCallsStatus":true},"categoryState":{"relatedCategories":{"headers":{"timestamp":"2022-05-17T12:31:15+00:00"},"categoryId":33577,"data":{"title":"Data Science","slug":"data-science","image":{"src":null,"width":0,"height":0},"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577}],"parentCategory":{"categoryId":33572,"title":"Information Technology","slug":"information-technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"}},"childCategories":[{"categoryId":33578,"title":"Big Data","slug":"big-data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"},"image":{"src":"/img/background-image-2.fabfbd5c.png","width":0,"height":0}},{"categoryId":33579,"title":"Databases","slug":"databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"},"image":{"src":"/img/background-image-1.daf74cf0.png","width":0,"height":0}},{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"image":{"src":"/img/background-image-2.fabfbd5c.png","width":0,"height":0}},{"categoryId":34365,"title":"Web Analytics","slug":"web-analytics","_links":{"self":"https://dummies-api.dummies.com/v2/categories/34365"},"image":{"src":"/img/background-image-1.daf74cf0.png","width":0,"height":0}}],"description":"Data science is what happens when you let the world's brightest minds loose on a big dataset. It gets crazy. 
Our articles will walk you through what data science is and what it does.","relatedArticles":{"self":"https://dummies-api.dummies.com/v2/articles?category=33577&offset=0&size=5"}},"_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"}},"relatedCategoriesLoadedStatus":"success"},"listState":{"list":{"count":10,"total":365,"items":[{"headers":{"creationTime":"2016-03-27T16:48:16+00:00","modifiedTime":"2022-04-27T21:14:13+00:00","timestamp":"2022-04-28T00:01:06+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General (Data Science)","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Predictive Analytics For Dummies Cheat Sheet","strippedTitle":"predictive analytics for dummies cheat sheet","slug":"predictive-analytics-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"These handy predictive analytics tips and checklists will help keep your project on the rails and out of the woods.","noIndex":0,"noFollow":0},"content":"A predictive analytics project combines execution of details with big-picture thinking. These handy tips and checklists will help keep your project on the rails and out of the woods.","description":"A predictive analytics project combines execution of details with big-picture thinking. 
These handy tips and checklists will help keep your project on the rails and out of the woods.","blurb":"","authors":[{"authorId":9445,"name":"Anasse Bari","slug":"anasse-bari","description":"Anasse Bari, PhD, is a data science expert and university professor with many years of predictive modeling and data analytics experience. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9445"}},{"authorId":9446,"name":"Mohamed Chaouchi","slug":"mohamed-chaouchi","description":"Mohamed Chaouchi is a veteran software engineer who has conducted extensive research using data mining methods. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9446"}},{"authorId":9447,"name":"Tommy Jung","slug":"tommy-jung","description":"Tommy Jung is a software engineer with expertise in enterprise web applications and analytics. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9447"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":229559,"title":"Predictive Analytics: Knowing When to Update Your Model","slug":"predictive-analytics-knowing-update-model","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/229559"}},{"articleId":229556,"title":"Tips for Building Deployable Models for Predictive Analytics","slug":"tips-building-deployable-models-predictive-analytics","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/229556"}},{"articleId":229553,"title":"Using Relevant Data 
for Predictive Analytics: Avoid “Garbage In, Garbage Out”","slug":"using-relevant-data-predictive-analytics-avoid-garbage-garbage","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/229553"}},{"articleId":229550,"title":"How to Build a Predictive Analytics Team","slug":"build-predictive-analytics-team","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/229550"}},{"articleId":229544,"title":"Enterprise Architecture for Big Data","slug":"enterprise-architecture-big-data","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/229544"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business 
Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281827,"slug":"predictive-analytics-for-dummies-2nd-edition","isbn":"9781119267003","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119267005/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119267005/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119267005-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119267005/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119267005/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/predictive-analytics-for-dummies-2nd-edition-cover-9781119267003-201x255.jpg","width":201,"height":255},"title":"Predictive Analytics For Dummies, 2nd Edition","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p><b data-author-id=\"10987\">Anasse Bari, Ph.D. </b>is data science expert and a university professor who has many years of predictive modeling and data analytics experience.</p> <p><b data-author-id=\"9446\">Mohamed Chaouchi </b>is a veteran software engineer who has conducted extensive research using data mining methods. 
</p>\n<p><b data-author-id=\"9447\">Tommy Jung</b> is a software engineer with expertise in enterprise web applications and analytics. </p>","authors":[{"authorId":10987,"name":"Dr. Anasse Bari","slug":"dr-anasse-bari","description":"","_links":{"self":"https://dummies-api.dummies.com/v2/authors/10987"}},{"authorId":9446,"name":"Mohamed Chaouchi","slug":"mohamed-chaouchi","description":"Mohamed Chaouchi is a veteran software engineer who has conducted extensive research using data mining methods. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9446"}},{"authorId":9447,"name":"Tommy Jung","slug":"tommy-jung","description":"Tommy Jung is a software engineer with expertise in enterprise web applications and analytics. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9447"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119267003&quot;]}]\" id=\"du-slot-6269d942c346f\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119267003&quot;]}]\" id=\"du-slot-6269d942c3ddd\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":155457,"title":"Building a 
Predictive Analytics Model","slug":"building-a-predictive-analytics-model","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/155457"}},{"articleId":155445,"title":"Data Sources for Predictive Analytics Projects","slug":"data-sources-for-predictive-analytics-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/155445"}},{"articleId":155403,"title":"Ensuring Success When Using Predictive Analytics","slug":"ensuring-success-when-using-predictive-analytics","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/155403"}}],"content":[{"title":"Building a predictive analytics model","thumb":null,"image":null,"content":"<p>A successful predictive analytics project is executed step by step. As you immerse yourself in the details of the project, watch for these major milestones:</p>\n<ol class=\"level-one\">\n<li>\n<p class=\"first-para\">Defining Business Objectives</p>\n<p class=\"child-para\">The project starts with using a well-defined business objective. The model is supposed to address a business question. Clearly stating that objective will allow you to define the scope of your project, and will provide you with the exact test to measure its success.</p>\n</li>\n<li>\n<p class=\"first-para\">Preparing Data</p>\n<p class=\"child-para\">You’ll use historical data to train your model. The data is usually scattered across multiple sources and may require cleansing and preparation. Data may contain duplicate records and outliers; depending on the analysis and the business objective, you decide whether to keep or remove them. 
Also, the data could have missing values, may need to undergo some transformation, and may be used to generate derived attributes that have more predictive power for your objective. Overall, the quality of the data indicates the quality of the model.</p>\n</li>\n<li>\n<p class=\"first-para\">Sampling Your Data</p>\n<p class=\"child-para\">You’ll need to split your data into two sets: training and test datasets. You build the model using the training dataset. You use the test dataset to verify the accuracy of the model’s output. Doing so is absolutely crucial. Otherwise you run the risk of <i>overfitting</i> your model: training the model with a limited dataset, to the point that it picks up all the characteristics (both the signal and the noise) that are only true for that particular dataset. A model that’s overfitted for a specific dataset will perform miserably when you run it on other datasets. A test dataset gives you a valid way to accurately measure your model’s performance.</p>\n</li>\n<li>\n<p class=\"first-para\">Building the Model</p>\n<p class=\"child-para\">Sometimes the data or the business objectives lend themselves to a specific algorithm or model. Other times the best approach is not so clear-cut. As you explore the data, run as many algorithms as you can; compare their outputs. Base your choice of the final model on the overall results. Sometimes you’re better off running an ensemble of models simultaneously on the data and choosing a final model by comparing their outputs.</p>\n</li>\n<li>\n<p class=\"first-para\">Deploying the Model</p>\n<p class=\"child-para\">After building the model, you have to deploy it in order to reap its benefits. That process may require coordination with other departments. Aim to build a deployable model. Also be sure you know how to present your results to the business stakeholders in an understandable and convincing way so they adopt your model. 
After the model is deployed, you’ll need to monitor its performance and continue improving it. Most models decay after a certain period of time. Keep your model up to date by refreshing it with newly available data.</p>\n</li>\n</ol>\n"},{"title":"Data sources for predictive analytics projects","thumb":null,"image":null,"content":"<p>Data for a predictive analytics project can come from many different sources. Some of the most common sources are within your own organization; other common sources include data purchased from outside vendors.</p>\n<p>Internal data sources include</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\">Transactional data, such as customer purchases</p>\n</li>\n<li>\n<p class=\"first-para\">Customer profiles, such as user-entered information from registration forms</p>\n</li>\n<li>\n<p class=\"first-para\">Campaign histories, including whether customers responded to advertisements</p>\n</li>\n<li>\n<p class=\"first-para\">Clickstream data, including the patterns of customers’ web clicks</p>\n</li>\n<li>\n<p class=\"first-para\">Customer interactions, such as those from e-mails, chats, surveys, and customer-service calls</p>\n</li>\n<li>\n<p class=\"first-para\">Machine-generated data, such as that from telematics, sensors, and smart meters</p>\n</li>\n</ul>\n<p>External data sources include</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\">Social media such as Facebook, Twitter, and LinkedIn</p>\n</li>\n<li>\n<p class=\"first-para\">Subscription services such as Bloomberg, Thomson Reuters, Esri, and Westlaw</p>\n</li>\n</ul>\n<p>By combining data from several disparate data sources in your predictive models, you may get a better overall view of your customer, and thus a more accurate model.</p>\n"},{"title":"Ensuring success when using predictive analytics","thumb":null,"image":null,"content":"<p>Think of predictive analytics as a bright bulb powered by your data. 
The light (insight) from predictive analytics can empower your strategy, streamline your operations, and improve your bottom line. The following four recommendations can help you ensure success for your predictive analytics initiatives.</p>\n<h2>Foster a culture of change</h2>\n<p>Predictive analytics should be adopted across the organization as a whole. The organization should embrace change. Business stakeholders should be ready to incorporate recommendations and adopt findings derived from the predictive analytics projects. The outcomes of a predictive analytics project are valuable only if the business leaders are willing to act on them.</p>\n<h2>Create a data-science team</h2>\n<p>Hire a data-science team whose sole job is to establish and support your predictive analytics solutions. This team of talented professionals (business analysts, data scientists, and information technologists) is better equipped to work on the project full-time. Including a range of professional backgrounds can bring valuable insights to the team from other domains. Selecting team members from different departments in your organization can help ensure widespread buy-in.</p>\n<h2>Use visualization tools effectively</h2>\n<p>Visualization is a powerful way to convey complex ideas efficiently. Using visualization effectively can help you initially explore and understand the data you’re working with. Visual aids such as charts can also help you evaluate the model’s output or compare the performance of predictive models.</p>\n<h2>Use predictive analytics tools</h2>\n<p>Powerful predictive analytics tools are available as software packages in the marketplace. They’re designed to make the whole process a lot easier. Without the use of such tools, building a model from scratch quickly becomes time-intensive. Using a good predictive analytics tool enables you to run multiple scenarios and instantaneously compare the results, all with a few clicks. 
A tool can quickly automate many of the time-consuming steps required to build and evaluate one or more models.</p>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2022-04-27T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":207733},{"headers":{"creationTime":"2019-12-22T23:12:06+00:00","modifiedTime":"2022-04-25T20:57:28+00:00","timestamp":"2022-04-26T00:01:11+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General (Data Science)","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Data Science Programming All-in-One For Dummies Cheat Sheet","strippedTitle":"data science programming all-in-one for dummies cheat sheet","slug":"data-science-programming-all-in-one-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"This cheat sheet will help you use data more effectively. 
Learn about types of data and how to choose the right programming language.","noIndex":0,"noFollow":0},"content":"Data science affects many <a href=\"https://www.dummies.com/programming/big-data/data-science/data-science-techniques-you-can-use-for-successful-change-management/\" target=\"_blank\" rel=\"noopener\">different technologies</a> in a profound manner. Our society runs on data today, so you can’t do many things that aren’t affected by it in some way. Even the timing of stoplights depends on data collected by the highway department. Your food shopping experience depends on data collected from Point of Sale (POS) terminals, surveys, farming data, and sources you can’t even begin to imagine.\r\n\r\nNo matter how you use data, this cheat sheet will help you use it more effectively.\r\n\r\n[caption id=\"attachment_266848\" align=\"alignnone\" width=\"556\"]<img class=\"size-full wp-image-266848\" src=\"https://www.dummies.com/wp-content/uploads/data-science-programming-code.jpg\" alt=\"Concept of data science programming code\" width=\"556\" height=\"344\" /> ©carlos castilla/Shutterstock.com[/caption]","description":"Data science affects many <a href=\"https://www.dummies.com/programming/big-data/data-science/data-science-techniques-you-can-use-for-successful-change-management/\" target=\"_blank\" rel=\"noopener\">different technologies</a> in a profound manner. Our society runs on data today, so you can’t do many things that aren’t affected by it in some way. Even the timing of stoplights depends on data collected by the highway department. 
Your food shopping experience depends on data collected from Point of Sale (POS) terminals, surveys, farming data, and sources you can’t even begin to imagine.\r\n\r\nNo matter how you use data, this cheat sheet will help you use it more effectively.\r\n\r\n[caption id=\"attachment_266848\" align=\"alignnone\" width=\"556\"]<img class=\"size-full wp-image-266848\" src=\"https://www.dummies.com/wp-content/uploads/data-science-programming-code.jpg\" alt=\"Concept of data science programming code\" width=\"556\" height=\"344\" /> ©carlos castilla/Shutterstock.com[/caption]","blurb":"","authors":[{"authorId":9109,"name":"John Paul Mueller","slug":"john-paul-mueller","description":"John Paul Mueller has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9109"}},{"authorId":9110,"name":"Luca Massaron","slug":"luca-massaron","description":"Luca Massaron is a Google developer expert in machine learning. Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9110"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":268328,"title":"Linear Regression vs. 
Logistic Regression","slug":"linear-regression-vs-logistic-regression","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/268328"}},{"articleId":268303,"title":"How Data is Collected and Why It Can Be Problematic","slug":"how-data-is-collected-and-why-it-can-be-problematic","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/268303"}},{"articleId":268298,"title":"How to Perform Pattern Matching in Python","slug":"how-to-perform-pattern-matching-in-python","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/268298"}},{"articleId":268293,"title":"How Pattern Matching Works in Data Science","slug":"how-pattern-matching-works-in-data-science","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/268293"}},{"articleId":268288,"title":"The Need for Reliable Sources in Data Science Applications","slug":"the-need-for-reliable-sources-in-data-science-applications","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/268288"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat 
Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281678,"slug":"data-science-programming-all-in-one-for-dummies-2","isbn":"9781119626114","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119626110/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119626110/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119626110-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119626110/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119626110/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.d
ummies.com/wp-content/uploads/data-science-programming-all-in-one-for-dummies-cover-9781119626114-203x255.jpg","width":203,"height":255},"title":"Data Science Programming All-in-One For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"\n <p><b data-author-id=\"9109\">John Paul Mueller</b> has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. <b data-author-id=\"9110\">Luca Massaron</b> is a Google developer expert in machine learning. Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.</p>","authors":[{"authorId":9109,"name":"John Paul Mueller","slug":"john-paul-mueller","description":"John Paul Mueller has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9109"}},{"authorId":9110,"name":"Luca Massaron","slug":"luca-massaron","description":"Luca Massaron is a Google developer expert in machine learning. 
Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9110"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119626114&quot;]}]\" id=\"du-slot-62673647a3d23\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119626114&quot;]}]\" id=\"du-slot-62673647a46d2\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":0,"title":"","slug":null,"categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/"}}],"content":[{"title":"Common Forms of Errant Data","thumb":null,"image":null,"content":"<p>Data becomes less useful or possibly not useful at all when it fails to meet specific needs, such as correctness. For many people, error equates to wrong. However, in many cases, data is correct, yet also erroneous. A sales statistic may reflect reality for a particular group, but if that group isn’t part of your analysis, the data is incorrect for your needs despite being correct data. 
You can consider data errant when it meets any of these criteria:</p>\n<ul>\n<li><strong>Incorrect:</strong> The data is actually wrong in some way.</li>\n<li><strong>Missing:</strong> The data isn’t there to use, such as a field that someone didn’t fill out in a form.</li>\n<li><strong>Wrong type:</strong> The data appears in a form that won’t work for your needs, such as numeric data that appears as a string rather than a number.</li>\n<li><strong>Malformatted:</strong> The data is in the correct form, but isn’t formatted correctly, for example, when you receive an older form of a state name, Wis, rather than the necessary two-character form, WI. Often, this errant data occurs because of a misunderstanding or the use of an outdated standard.</li>\n<li><strong>Wrong format for the task:</strong> The data is correct in every possible way except that it isn’t in the format the task requires. For example, a date could appear in MM/DD/YY form when you need it in DD/MM/YY form. Because some dates, such as January 1, can be correct in other formats, this particular form of errant data is hard to track down.</li>\n<li><strong>Incomplete:</strong> The data is correct to an extent, but something is missing. For example, you might need a four-digit year, but you receive only a two-digit year instead.</li>\n<li><strong>Imprecise:</strong> The data isn’t sufficiently accurate for your task, such as when you receive an integer value in place of a floating-point value.</li>\n<li><strong>Misaligned:</strong> The data is correct, is in the right form, and is even of the right precision, but it still won’t parse because of some type of shifting. For example, when working with text, perhaps two spaces appear after each period, rather than one space. A number field on a form might not convert to a number because someone added a space in the entry.</li>\n<li><strong>Outdated:</strong> Data gets old just like anything else. 
Using old data will cause problems with your analysis unless you’re looking at it for historical purposes.</li>\n<li><strong>Opinion, rather than fact:</strong> A fact is verifiable through some type of process and vetted through peer review. An opinion can inform or enlighten, but it won’t help your analysis.</li>\n<li><strong>Misclassified:</strong> The data is useful in every possible way, except that it’s the wrong information. For example, you might find yourself using statistics on cat physiology when your intention was to research dogs.</li>\n</ul>\n"},{"title":"Missing Data","thumb":null,"image":null,"content":"<p>Missing data will tend to skew or bias the results of any analysis you perform using it. Consequently, you must find a way to deal with the missing data or face the fact that your analysis will contain flaws. You have a few possible strategies to handle missing data effectively. Your strategy may change if you have to handle missing values of these types:</p>\n<ul>\n<li><strong>Quantitative values:</strong> Data values expressed as numbers.</li>\n<li><strong>Qualitative features:</strong> Data that refers to concepts. Even though you express them as numbers, their values are somewhat arbitrary, and you cannot meaningfully take an average or other computations on them.</li>\n</ul>\n<p>When working with qualitative features, your value guessing should always produce integer numbers, based on the numbers used as codes. Here are common strategies for missing data handling:</p>\n<ul>\n<li><strong>Replace missing values with a computed constant such as the mean or the median value.</strong> If your feature is a category, you must provide a specific value because the numbering is arbitrary, and using mean or median doesn’t make sense. 
Use this strategy when the missing values are random.</li>\n<li><strong>Replace missing values with a value outside the normal value range of the feature.</strong> For instance, if the feature is positive, replace missing values with negative values. This approach works fine with decision tree–based algorithms and qualitative variables.</li>\n<li><strong>Replace missing values with 0, which works well with regression models and standardized variables.</strong> This approach is also applicable to qualitative variables when they contain binary values.</li>\n<li><strong>Interpolate the missing values when they are part of a series of values tied to time.</strong> This approach works only for quantitative values. For instance, if your feature is daily sales, you could use a moving average of the last seven days or pick the value at the same time as the previous week.</li>\n<li><strong>Impute their value using the information from other predictor features (but never use the response variable).</strong> Particularly in R, specialized libraries like <a href=\"https://cran.r-project.org/web/packages/missForest/index.html\">missForest</a>, <a href=\"https://cran.r-project.org/web/packages/mice/index.html\">MICE</a> and <a href=\"https://gking.harvard.edu/amelia\">Amelia II</a> can do everything for you. <a href=\"https://scikit-learn.org/stable/modules/generated/sklearn.impute.IterativeImputer.html\">Scikit-learn</a> recently introduced an experimental missing values imputer that allows imputing data in Python using Multivariate Imputation by Chained Equations (MICE), missForest, or even Amelia methodologies.</li>\n</ul>\n<p>Another good practice is to create a new binary feature for each variable whose values you repaired. 
The binary variable will track variations that result from replacement or imputing with a positive value, and your machine learning algorithm can figure out when it must make additional adjustments to the values you actually used.</p>\n"},{"title":"Consider Novelty Data Types","thumb":null,"image":null,"content":"<p>Experience teaches that the world is rarely stable. Sometimes novelties do naturally appear because the world is so mutable. Consequently, your data changes over time in unexpected ways, in both target and predictor variables. A <em>target</em> variable is the variable you want to know more about, and the <em>predictor</em> variable is the independent variable used to predict the target variable. This phenomenon is called <em>concept drift.</em> The term <em>concept</em> refers to your target and <em>drift</em> to the source data used to perform a prediction that moves in a slow but uncontrollable way, like a boat drifting because of strong tides.</p>\n<p>To obtain a relevant and useful analysis, you check for any new data containing anomalies with respect to existing cases. Maybe you spent a lot of time cleaning your data or you developed a machine learning application based on available data, so it would be critical to figure out whether the new data is similar to the old data and whether the algorithms will continue working well in classification or prediction.</p>\n<p>In such cases, data scientists talk of novelty detection, because they need to know how well the new data resembles the old. Being exceptionally new is considered an anomaly: Novelty may conceal a significant event or may risk preventing an algorithm from working properly because tasks such as machine learning rely heavily on learning from past examples, and the algorithm may not generalize to completely novel cases. When working with new data, you should retrain the algorithm. 
When considering a data science model, you distinguish between different concept drift and novelty situations using these criteria:</p>\n<ul>\n<li><strong>Physical:</strong> Face or voice recognition systems, or even climate models, never really change. Don’t expect novelties, but check for outliers that result from data problems, such as erroneous measurements.</li>\n<li><strong>Political and economic:</strong> These models sometimes change, especially in the long run. You have to keep an eye out for long-term effects that start slowly and then propagate and consolidate, rendering your models ineffective.</li>\n<li><strong>Social behavior:</strong> Social networks and the language you use every day change over time. Expect novelties to appear and take precautionary steps; otherwise, your model will suddenly deteriorate and turn unusable.</li>\n<li><strong>Search engine data, banking, and e-commerce fraud schemes:</strong> These models change quite often. You need to exercise extra care in checking for the appearance of novelties, which tell you it’s time to train a new model to maintain accuracy.</li>\n<li><strong>Cyber security threats and advertising trends:</strong> These models change continuously. Spotting novelties is the norm, and reusing the same models over a long time is a hazard.</li>\n</ul>\n"},{"title":"Choose the Correct Programming Language","thumb":null,"image":null,"content":"<p>Data scientists usually use only a few languages because they make working with data easier. The world holds many different programming languages, and most are designed to perform tasks in a certain way or even make a particular profession’s work easier to do. Choosing the correct tool makes your life easier. Using the wrong tool is akin to using a hammer rather than a screwdriver to drive a screw. Yes, the hammer works, but the screwdriver is much easier to use and definitely does a better job. 
Here are the top languages for data science work in order of preference:</p>\n<ul>\n<li><strong>Python (general purpose):</strong> Many data scientists prefer to use Python because it provides a wealth of libraries, such as NumPy, SciPy, Matplotlib, pandas, and Scikit-learn, to make data science tasks significantly easier. Python is also a precise language that makes using multiprocessing on large datasets easy, reducing the time required to analyze them. The data science community has also stepped up with specialized IDEs, such as Anaconda, that implement the Jupyter Notebook concept, which makes working with data science calculations significantly easier. Besides all of these things in Python’s favor, it’s also an excellent language for creating glue code with languages such as C/C++ and Fortran. The Python documentation shows how to create the required extensions. Most Python users rely on the language to see patterns, such as allowing a robot to see a group of pixels as an object. Python also sees use for all sorts of scientific tasks.</li>\n<li><strong>R (special-purpose statistical):</strong> In many respects, Python and R share the same sorts of functionality, but they implement it in different ways. Depending on which source you view, Python and R have about the same number of proponents, and some people use Python and R interchangeably (or sometimes in tandem). Unlike Python, R provides its own environment, so you don’t need a third-party product such as Anaconda. However, you can also use third-party IDEs such as Jupyter Notebook so that you can use a single IDE for all your needs. Unfortunately, R doesn’t appear to mix with other languages with the ease that Python provides.</li>\n<li><strong>SQL (database management):</strong> The most important thing to remember about Structured Query Language (SQL) is that it focuses on data rather than tasks. Businesses can’t operate without good data management — the data is the business. 
Large organizations use some sort of relational database, which is normally accessible with SQL, to store their data. Most Database Management System (DBMS) products rely on SQL as their main language, and the DBMS usually has a large number of data analysis and other data science features built in. Because you’re accessing the data natively, you often experience a significant speed gain in performing data science tasks this way. Database Administrators (DBAs) generally use SQL to manage or manipulate the data rather than necessarily perform detailed analysis of it. However, the data scientist can also use SQL for various data science tasks and make the resulting scripts available to the DBAs for their needs.</li>\n<li><strong>Java (general purpose):</strong> Some data scientists perform other kinds of programming that require a general-purpose, widely adopted, and popular language. In addition to providing access to a large number of libraries (most of which aren’t actually all that useful for data science but do work for other needs), Java supports object orientation better than any of the other languages in this list. In addition, it’s strongly typed and tends to run quite quickly. Consequently, some people prefer it for finalized code. Java isn’t a good choice for experimentation or ad hoc queries. Oddly enough, an implementation of Java for Jupyter Notebook exists, but it isn’t refined and is not usable for data science work at this time.</li>\n</ul>\n<p style=\"padding-left: 40px;\">One thing to note about Java is that Microsoft is taking a significantly stronger interest in the language and that may spell some changes in the future. 
See these articles: <a href=\"https://www.pcworld.com/article/2049415/microsoft-sails-past-oracle-in-bringing-java-se-to-the-cloud.html\">Microsoft sails past Oracle in bringing Java SE to the cloud</a>, <a href=\"https://devblogs.microsoft.com/visualstudio/java-on-visual-studio-code-october-update/\">Java on Visual Studio Code October Update</a>, <a href=\"https://jaxenter.com/jax-london-2019-begun-microsoft-now-java-shop-162790.html\">JAX London 2019 has begun: “Microsoft is now a Java shop”</a>, and <a href=\"https://www.javaoffheap.com/category/microsoft\">Episode 48. On Jakarta EE 9 Band-aids, OracleCodeOne Debrief, Unionizing Tech, IBM vs Microsoft and Oracle JDBC Drivers!</a> for some ideas on changes that could take place.</p>\n<ul>\n<li><strong>Scala (general purpose):</strong> Because Scala uses the Java Virtual Machine (JVM), it does have some of the advantages and disadvantages of Java. However, like Python, Scala provides strong support for the functional programming paradigm, which uses lambda calculus as its basis (see <em>Functional Programming For Dummies,</em> by John Mueller [Wiley] for details). In addition, Apache Spark is written in Scala, which means that you have good support for cluster computing when using this language — think huge dataset support. 
Some of the pitfalls of using Scala are that it’s hard to set up correctly, it has a steep learning curve, and it lacks a comprehensive set of data science specific libraries.</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Six months","lifeExpectancySetFrom":"2022-04-25T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":266847},{"headers":{"creationTime":"2019-06-04T16:39:57+00:00","modifiedTime":"2022-04-15T14:02:57+00:00","timestamp":"2022-04-15T18:01:05+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General (Data Science)","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Data Science Strategy For Dummies Cheat Sheet","strippedTitle":"data science strategy for dummies cheat sheet","slug":"data-science-strategy-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Modern business is all about the data. Use this guide from Dummies.com to discover how to begin your data science strategy.","noIndex":0,"noFollow":0},"content":"A revolutionary change is taking place in society and it involves data science. 
Everybody from small local companies to global enterprises is starting to realize the potential of <a href=\"https://www.dummies.com/programming/big-data/data-science/what-is-data-science/\" target=\"_blank\" rel=\"noopener\">data science</a> and is seeing the value in digitizing their data assets and becoming data driven. Regardless of industry, companies have embarked on a similar journey to explore how to drive new business value by utilizing analytics, <a href=\"https://www.dummies.com/programming/learning-process-works-machine-learning/\" target=\"_blank\" rel=\"noopener\">machine learning</a> (ML), and artificial intelligence (AI) techniques and introducing data science as a new discipline.\r\n\r\nHowever, although utilizing these new technologies will help companies simplify their operations and drive down costs, nothing is simple about getting the strategic approach right for your data science investment. This cheat sheet gives you a peek at the fundamental concepts you need to be on top of when building your data science strategy. It looks not only at investing in a top-performing data science team, but also at what to consider in your data architecture and how to approach the commercial aspects of data science.","description":"A revolutionary change is taking place in society and it involves data science. 
Regardless of industry, companies have embarked on a similar journey to explore how to drive new business value by utilizing analytics, <a href=\"https://www.dummies.com/programming/learning-process-works-machine-learning/\" target=\"_blank\" rel=\"noopener\">machine learning</a> (ML), and artificial intelligence (AI) techniques and introducing data science as a new discipline.\r\n\r\nHowever, although utilizing these new technologies will help companies simplify their operations and drive down costs, nothing is simple about getting the strategic approach right for your data science investment. This cheat sheet gives you a peek at the fundamental concepts you need to be on top of when building your data science strategy. It looks not only at investing in a top-performing data science team, but also at what to consider in your data architecture and how to approach the commercial aspects of data science.","blurb":"","authors":[],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":262585,"title":"Data Science Techniques You Can Use for Successful Change Management","slug":"data-science-techniques-you-can-use-for-successful-change-management","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262585"}},{"articleId":262577,"title":"Current Trends in Data","slug":"current-trends-in-data","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262577"}},{"articleId":262574,"title":"The Ethics of 
Artificial Intelligence","slug":"the-ethics-of-artificial-intelligence","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262574"}},{"articleId":262569,"title":"10 Mistakes to Avoid When Investing in Data Science","slug":"10-mistakes-to-avoid-when-investing-in-data-science","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262569"}},{"articleId":262563,"title":"Data Science Careers: The Roles in a Data Science Team","slug":"data-science-careers-the-roles-in-a-data-science-team","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262563"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business 
Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281679,"slug":"data-science-strategy-for-dummies","isbn":"9781119566250","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119566258/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119566258/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119566258-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119566258/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119566258/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/data-science-strategy-for-dummies-cover-9781119566250-203x255.jpg","width":203,"height":255},"title":"Data Science Strategy For Dummies","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p><b data-author-id=\"27914\">Ulrika Jägare</b> is an M.Sc. Director at Ericsson AB. With a decade of experience in analytics and machine intelligence and 19 years in telecommunications, she has held leadership positions in R&amp;D and product management. Ulrika was key to Ericsson’s Machine Intelligence strategy and the recent Ericsson Operations Engine launch, 
a new data and AI driven operational model for Network Operations in telecommunications. </p>","authors":[{"authorId":27914,"name":"Ulrika Jägare","slug":"ulrika-jagare","description":"Ulrika Jägare, MSc, is a director at Ericsson AB. With a decade of experience in analytics and machine intelligence and over 20 years in telecommunications, she has held leadership positions in R&D and product management.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/27914"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119566250&quot;]}]\" id=\"du-slot-6259b2e20859b\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119566250&quot;]}]\" id=\"du-slot-6259b2e209090\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":262005,"title":"Data Science Strategy: Machine Learning Basics","slug":"","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262005"}},{"articleId":262037,"title":"What Does it Mean to Be a Data-Driven 
Organization?","slug":"","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262037"}},{"articleId":262039,"title":"Defining and Scoping a Data Science Strategy","slug":"","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262039"}},{"articleId":262045,"title":"The Basics of Data for Data Science","slug":"","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262045"}},{"articleId":262047,"title":"Knowing the Value of Data","slug":"","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/262047"}}],"content":[{"title":"Data science strategy: Machine learning basics","thumb":null,"image":null,"content":"<p>What&#8217;s the difference between advanced analytics and machine learning? When is it advisable to go for one approach or the other? It’s always good to start out by defining machine learning. <a href=\"https://www.dummies.com/web-design-development/other-web-software/development-machine-learning/\" target=\"_blank\" rel=\"noopener\">Machine learning</a> (ML) is the scientific study of algorithms and statistical models that computer systems use to progressively improve their performance on a specific task. 
Machine learning algorithms build a mathematical model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to perform the task.</p>\n<div class=\"figure-container\"><figure id=\"attachment_262043\" aria-labelledby=\"figcaption_attachment_262043\" class=\"wp-caption alignleft\" style=\"width: 545px\"><img loading=\"lazy\" class=\"wp-image-262043 size-full\" src=\"https://www.dummies.com/wp-content/uploads/data-science-strategy.jpg\" alt=\"data science\" width=\"535\" height=\"356\" /><figcaption id=\"figcaption_attachment_262043\" class=\"wp-caption-text\">©g-whiteMocca/Shutterstock</figcaption></figure></div><div class=\"clearfix\"></div>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>&nbsp;</p>\n<p>So, here&#8217;s how advanced analytics and machine learning have some characteristics in common:</p>\n<ul>\n<li>Both advanced analytics and machine learning techniques are used for building and executing advanced mathematical and statistical models as well as building optimized models that can be used to predict events before they happen.</li>\n<li>Both methods use data to develop the models, and both require defined model policies.</li>\n<li>Automation can be used to run both analytics models and machine learning models after they’re put into production.</li>\n</ul>\n<p>What about the differences between advanced analytics and machine learning?</p>\n<ul>\n<li>There is a difference in who the actor is when creating your model. In an advanced analytics model, the actor is human; in a machine learning model, the actor is (obviously) a machine.</li>\n<li>There is also a difference in the model format. 
Analytics models are developed and deployed with a human-defined design, whereas machine learning models are dynamic and change design and approach as they’re being trained by the data, optimizing the design along the way. Machine learning models can also be deployed as <em>dynamic,</em> which means that they continue to train, learn, and optimize the design when exposed to real-life data and its live context.</li>\n<li>Another difference between analytical models and machine learning models lies in how the models use data: analytics models are tested with it, whereas machine learning models are trained with it. In analytics, data is used to test that the defined outcome is achieved as expected, while in machine learning, the data is used to train the model to optimize its design depending on the nature of the data.</li>\n<li>Finally, the techniques and tools used to develop advanced analytics models and machine learning models differ. Machine learning modeling techniques are much more advanced and are built on other principles related to how the machine will learn to optimize the model performance.</li>\n</ul>\n"},{"title":"What does it mean to be a data-driven organization?","thumb":null,"image":null,"content":"<p>Data is the new black! Or the new oil! Or the new gold! Whatever you compare data to, it’s probably true from a conceptual value perspective. As a society, we have now entered a new era of data and intelligent machines. And data science isn’t a passing trend or something that you can or should avoid. Instead, you should embrace it and ask yourself whether you understand enough about it to leverage it in your business. Be open-minded and curious! 
Dare to ask yourself whether you truly understand what being data-driven means.</p>\n<p>If you start by putting the ongoing changes happening in society into a wider context, it’s a common understanding that we humans are now experiencing a fourth industrial revolution, driven by access to data and advanced technology. It’s also referred to as the digital revolution. But beware! Digitizing or digitalizing your business isn’t the same as being data-driven.</p>\n<p class=\"article-tips remember\">Digitization is a widely used concept that basically refers to transitioning from analog to digital, like the conversion of data to a digital format. In relation to that, digitalization refers to making the digitized information work in your business.</p>\n<p>The concept of digitalizing a business is sometimes mixed up with being data-driven. However, it’s vital to remember that digitalizing the data isn’t just a good thing to do — it’s the foundation for enabling a data-driven enterprise. Without digitalization, you simply cannot become data-driven.</p>\n<p>In a data-driven organization, the starting point is data. It’s truly the foundation of everything. But what does that actually mean? Well, being data-driven means that you need to be ready to take data seriously. And what does <em>that</em> mean? Well, in practice, it means that data is the starting point and you analyze data and understand what type of business you should be doing. You must take the outcome of the analysis seriously enough to be prepared to change your business models accordingly. You must be ready to trust and use the data to drive your business forward. It should be your main concern in the company. You need to become “data-obsessed.”</p>\n"},{"title":"Defining and scoping a data science strategy","thumb":null,"image":null,"content":"<p>There&#8217;s a difference between a <em>data science strategy</em> and a <em>data strategy. 
</em>At a high level, a data science strategy refers to the strategy you define with regard to the entire data science investment in your company. A data science strategy includes areas such as overall data science objectives and strategic choices, regulatory strategies, data needs, competences and skillsets, data architecture, as well as how to measure the outcome.</p>\n<p>The data strategy, on the other hand, constitutes a subset of the data science strategy and is focused on outlining the strategic direction directly related to the data. This includes areas such as data scope, data consent, legal, regulatory and ethical considerations, storage collection frequency, data storage retention periods, data management process and principles, and, last but not least, data governance.</p>\n<p class=\"article-tips remember\">Both strategies are needed in order to succeed with your data science investment and should complement each other in order to work.</p>\n<p>If you ask about the <em>objectives</em> of a data science strategy, you’re asking whether there are clear company objectives set and agreed on for any of the investments made in data science. Are the objectives of your data science strategy formulated in a way that makes them possible to execute and measure success by? If not, then the objectives need to be reformulated; this is a critically important starting point that must be completed properly in order to succeed down the line.</p>\n<p>Data science is a new field that holds amazing opportunities for companies to drive a fundamental transformation, but it is complex and often not fully understood by top management. 
You should consider whether the executive team&#8217;s understanding of data science is sufficient to set the right targets or whether they need to be educated and then guided in setting their target.</p>\n<p class=\"article-tips tip\">Whether you’re a manager or an employee in a small or large company, if you want your company to succeed with its data science investment, don’t sit and hope that the leadership of your company will understand what needs to be done. If you’re knowledgeable in the area, make your voice heard or, if you aren’t, don’t hesitate to accept help from those who have experience in the field.</p>\n<p class=\"article-tips remember\">If you decide to bring in external experts to assist you in your data science strategizing, be sure to read up on the area yourself first, so that you can judge the relevance of their recommendations for your business — the place where you are the expert.</p>\n"},{"title":"The basics of data for data science","thumb":null,"image":null,"content":"<p>The terms <em>data</em> and <em>information</em> are often used interchangeably; there is a difference between them, however. For example, data can be described as raw, unorganized facts that need to be processed — a collection of numbers, symbols, or characters before it has been cleaned and corrected. <a href=\"https://www.dummies.com/programming/big-data/data-science/how-to-convert-raw-data-into-a-predictive-analysis-matrix/\" target=\"_blank\" rel=\"noopener\">Raw data</a> needs to be corrected to remove flaws like outliers and data entry errors.</p>\n<p>Raw data can be generated in many different ways. <em>Field</em> data, for example, is raw data that has been collected in an uncontrolled live environment. <em>Experimental </em>data has been generated within the context of a scientific investigation by observation and recording. 
Data can seem simple, random, and useless until it’s organized, but once data is processed, organized, structured, or presented in a given context that makes it useful, it’s called <em>information.</em></p>\n<p>Historically, the concept of data has been most closely associated with scientific research, but now data is being collected, stored, and used by an increasing number of companies, organizations, and institutions. For companies, examples of interesting data can be customer data, product data, sales data, revenue, and profits; for governments, it can include data such as crime rates and unemployment rates.</p>\n<p>During the second half of the 1900s, there were several attempts to standardize the categorization and structure of data in order to make sense of its various forms. One well-known model for this is the DIKW (<em>d</em>ata, <em>i</em>nformation, <em>k</em>nowledge, and <em>w</em>isdom) pyramid, described in the following list. The first version of this model was drafted as early as the mid-1950s, but it first appeared in its current state in the mid-1990s, as an attempt to make sense of the growing amounts of data (raw or processed) that were being generated from different computer systems:</p>\n<ul>\n<li><strong>Data </strong>is raw. It simply exists and has no significance beyond its existence (in and of itself). It can exist in any form, usable or not. Data represents a fact or statement of event without relation to other factors — <em>it’s raining, </em>for example.</li>\n<li><strong>Information</strong> is data that has been given a meaning by way of some sort of relationship. This meaning can be useful, but does not have to be. The information relationship can be related to cause-and-effect — <em>the temperature dropped 15 degrees and then it started raining</em>, for example.</li>\n<li><strong>Knowledge</strong> is the collection of information with the purpose of being useful. 
It represents a pattern that connects discrete elements and generally provides a high level of predictability for what is described or what will happen next: <em>If the humidity is very high and the temperature drops substantially, the atmosphere is often unable to hold the moisture, and so it rains,</em> for example.</li>\n<li><strong>Wisdom</strong> exemplifies more of an understanding of fundamental principles within the knowledge that essentially form the basis of the knowledge being what it is. Wisdom is essentially a shared understanding that is not questioned: <em>It rains because it rains,</em> for example. This encompasses an understanding of all interactions that happen between raining, evaporation, air currents, temperature gradients, changes, and rain.</li>\n</ul>\n<p>The DIKW pyramid offered a new way to categorize data as it passes through different stages in its life cycle and has gained some attention over the years. However, it has also been criticized, and variants have appeared that were designed to improve on the original. One major criticism has been that, although it’s easy enough to understand the step from data to information, it’s much harder to draw a clear and valid line from information to knowledge and from knowledge to wisdom, making it difficult to apply in practice.</p>\n<p class=\"article-tips remember\">Conceptual models are heuristic devices: They’re useful only insofar as they offer a way to learn something new. One model or another may be more appealing to you, but from the perspective of a data science implementation, the most important thing for you to consider is a question like this: Will my company gain value from having the four levels of the DIKW pyramid, or will it just make implementation more difficult and complex?</p>\n"},{"title":"Knowing the value of data","thumb":null,"image":null,"content":"<p>The statement “Data is the new oil” is one that lots of people make, but what does it mean? 
In some ways, the analogy <em>does</em> fit: It’s easy to draw parallels because of the way information (data) is used to drive much of the transformative technology available today via artificial intelligence, machine learning, automation, and advanced analytics — much like oil drives the global industrial economy.</p>\n<p>So, as a marketing approach and a high-level description, the expression does its job, but if you take it as an indication of how to strategically address the value of data, it might lead to investments that cannot be turned into value. For example, storing data has no guaranteed future value, as storing oil does. Storing even more data has even less value because it becomes even more difficult to find the data so that you can put it to use. The value in data lies not in saving it up or storing it — it lies in putting it to use, over and over again. That’s when the value in data is realized.</p>\n<p>If you start by looking at the core of the analogy, you can see that it refers to the value aspects of data as an enabler of a fundamental transformation of society — just like oil has proven to be throughout history. From that perspective, it definitely showcases the similarities between oil and data. Another similarity is that, although inherently valuable, data needs processing — just as oil needs refining — before its true value can be unlocked.</p>\n<p>However, data also has many other aspects that cause the analogy to fall apart when examined more closely. To see what this means, check out some of the differences between these two enablers of transformation:</p>\n<ul>\n<li><strong>Availability: </strong>Though oil is a finite resource, data is an endless and constantly increasing resource. This means that treating data like oil (hoarding it and storing it in silos, for example) has little benefit and reduces its usefulness. 
Nevertheless, because of the misconception that data is similar to oil (scarce), this is often exactly what is done with the data, driving investments and behavior in the wrong direction.</li>\n<li><strong>Reusability:</strong> Data becomes more useful the more it’s used, which is the exact opposite of what happens with oil. When oil is used to generate energy like heat or light, or when oil is permanently converted into another form such as plastic, the oil is gone and cannot be reused. Therefore, treating data like oil — using it once and then assuming that its usefulness has been exhausted and disposing of it — is definitely a mistake.</li>\n<li><strong>Capture:</strong> Everyone knows that as the world’s oil reserves decline, extracting oil becomes increasingly difficult and expensive. Data, on the other hand, is becoming increasingly available as the digitalization of society increases.</li>\n<li><strong>Variety:</strong> Data also has far more variety than oil. The raw oil that’s drilled from the ground is processed in a variety of ways into many different products, of course, but in its raw state, it’s all the same. Data in its raw format can represent words, pictures, sounds, ideas, facts, measurements, statistics, or any other characteristic that can be processed by computers.</li>\n</ul>\n<p class=\"article-tips remember\">The fact nevertheless remains that the quantities of data available today constitute an entirely new commodity, though the rules for capturing, storing, treating, and using data are still being written. 
Let&#8217;s stress here, however, that data, like oil, is a vital source of power and that the companies that utilize the available data in the most optimized way (thereby controlling the market) are establishing themselves as the leaders of the world economy, just as the oil barons did a hundred years ago.</p>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2022-04-15T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":262049},{"headers":{"creationTime":"2016-03-27T16:46:51+00:00","modifiedTime":"2022-04-12T20:12:23+00:00","timestamp":"2022-04-13T00:01:05+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Big Data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"},"slug":"big-data","categoryId":33578}],"title":"Big Data for Small Business For Dummies Cheat Sheet","strippedTitle":"big data for small business for dummies cheat sheet","slug":"big-data-for-small-business-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Discover the key terminology you need to understand the crucial big data skills for businesses and how to communicate data to your 
company.","noIndex":0,"noFollow":0},"content":"Big data makes big headlines, but it’s much more than just a buzz phrase or the latest business fad. The phenomenon is very real and it’s producing concrete benefits in so many different areas – particularly in business. Here you will get to the heart of big data as a business owner or manager: You will take a look at the key terminology you need to understand the crucial big data skills for businesses, ten steps to using big data to make better decisions, and tips for communicating insights from data to your colleagues.","description":"Big data makes big headlines, but it’s much more than just a buzz phrase or the latest business fad. The phenomenon is very real and it’s producing concrete benefits in so many different areas – particularly in business. Here you will get to the heart of big data as a business owner or manager: You will take a look at the key terminology you need to understand the crucial big data skills for businesses, ten steps to using big data to make better decisions, and tips for communicating insights from data to your colleagues.","blurb":"","authors":[{"authorId":9052,"name":"Bernard Marr","slug":"bernard-marr","description":"Bernard Marr helps companies to better manage, measure, report and analyze performance. His leading-edge work with major companies, organizations and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant and teacher. 
Learn more at bernardmarr.com.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9052"}}],"primaryCategoryTaxonomy":{"categoryId":33578,"title":"Big Data","slug":"big-data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"}},"secondaryCategoryTaxonomy":{"categoryId":34253,"title":"General (Small Business)","slug":"general-small-business","_links":{"self":"https://dummies-api.dummies.com/v2/categories/34253"}},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":140207,"title":"10 Big Data Predictions for the Future","slug":"10-big-data-predictions-for-the-future","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140207"}},{"articleId":140196,"title":"Big Data: Starting with Strategy","slug":"big-data-starting-with-strategy","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140196"}},{"articleId":140195,"title":"Overcoming the Big Data Skills Shortage","slug":"overcoming-the-big-data-skills-shortage","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140195"}},{"articleId":140190,"title":"Understanding Big Data and the Internet of Things","slug":"understanding-big-data-and-the-internet-of-things","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140190"}},{"articleId":140156,"title":"6 Key Big Data Skills Every Business 
Needs","slug":"6-key-big-data-skills-every-business-needs","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140156"}}],"fromCategory":[{"articleId":207996,"title":"Big Data For Dummies Cheat Sheet","slug":"big-data-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207996"}},{"articleId":207478,"title":"Statistics for Big Data For Dummies Cheat Sheet","slug":"statistics-for-big-data-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207478"}},{"articleId":168988,"title":"Integrate Big Data with the Traditional Data Warehouse","slug":"integrate-big-data-with-the-traditional-data-warehouse","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168988"}},{"articleId":168987,"title":"Best Practices for Big Data Integration","slug":"best-practices-for-big-data-integration","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168987"}},{"articleId":168985,"title":"How to Analyze Big Data to Get 
Results","slug":"how-to-analyze-big-data-to-get-results","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168985"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281550,"slug":"big-data-for-small-business-for-dummies","isbn":"9781119027034","categoryList":["technology","information-technology","data-science","big-data"],"amazon":{"default":"https://www.amazon.com/gp/product/1119027039/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119027039/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119027039-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119027039/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119027039/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/big-data-for-small-business-for-dummies-cover-9781119027034-203x255.jpg","width":203,"height":255},"title":"Big Data For Small Business For Dummies","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p><b data-author-id=\"9052\">Bernard Marr</b> helps companies to better manage, measure, report and analyse performance. His leading-edge work with major companies, organisations and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant and teacher. </p>","authors":[{"authorId":9052,"name":"Bernard Marr","slug":"bernard-marr","description":"Bernard Marr helps companies to better manage, measure, report and analyze performance. His leading-edge work with major companies, organizations and governments across the globe makes him an acclaimed and award-winning keynote speaker, researcher, consultant and teacher. 
Learn more at bernardmarr.com.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9052"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119027034&quot;]}]\" id=\"du-slot-625612c119a7a\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119027034&quot;]}]\" id=\"du-slot-625612c119fab\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":140154,"title":"Understanding Big Data Jargon","slug":"understanding-big-data-jargon","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140154"}},{"articleId":140156,"title":"6 Key Big Data Skills Every Business Needs","slug":"6-key-big-data-skills-every-business-needs","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140156"}},{"articleId":140157,"title":"10 Steps to Using Data to Improve Business 
Decisions","slug":"10-steps-to-using-data-to-improve-business-decisions","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140157"}},{"articleId":140155,"title":"How to Communicate Insights from Big Data","slug":"how-to-communicate-insights-from-big-data","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/140155"}}],"content":[{"title":"Understanding big data jargon","thumb":null,"image":null,"content":"<p>The technical jargon surrounding big data can seem a little daunting at first. The key phrases and terms you’re likely to come across, with easy-to-understand definitions for each, follow:</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Big data: </b>Increasingly, everything you do leaves a digital trace (or data), which you (and others) can use and analyse. The phrase <i>big data</i> refers to that data being collected and the ability to make use of it.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Big data analytics:</b><i> </i>This is the process of collecting, processing and analysing data to generate insights that inform fact-based decision making. In many cases it involves software-based analysis using algorithms.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Algorithm: </b>A mathematical formula or statistical process run by software to analyse data. It usually involves multiple calculation steps and can be used to automatically process data or solve problems.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Cloud computing:</b><i> </i>Software or data running on remote servers, rather than locally. 
So instead of storing or computing things on your own machine, you can use other computers that are connected to your computer via a network (such as the Internet).</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Structured data:</b><i> </i>Any data or information located in a fixed field within a defined record or file, such as a database or spreadsheet. Its inherent structure makes it quick, easy and cheap to analyse.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Unstructured data:</b><i> </i>All the data not easily stored and indexed in traditional formats or databases. It includes email conversations, social media posts, video content, photos, voice recordings, sounds and so on. Its lack of structure makes it more difficult to analyse using traditional computer programs.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Semi-structured data:</b><i> </i>You guessed it: this is a cross between unstructured and structured data. It’s data that may have some structure that can be used for analysis but lacks the strict structure found in databases or spreadsheets. For example, a Facebook post can be categorised by author, date, length and even sentiment, but the content is generally unstructured.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Internal data: </b>This accounts for all the data your business currently has or could potentially access or generate in future. It could be structured in format (for example, a customer database) or it could be unstructured (conversational data from customer service calls).</p>\n</li>\n<li>\n<p class=\"first-para\"><b>External data: </b>Put simply, this is the infinite array of information that exists outside your business. It can be publicly available or privately held and it can also be structured or unstructured in format.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>The Internet of Things:</b><i> </i>A network that connects devices (the <i>things</i> referred to in the name) so that they can communicate with each other. 
This encompasses technology like smart televisions, smart phones, and sensors, and it’s all possible thanks to the massive increase in connectivity between devices, systems and services.</p>\n</li>\n</ul>\n"},{"title":"6 key big data skills every business needs","thumb":null,"image":null,"content":"<p>What are the key skills required to use big data successfully? The list here includes six key skills that all businesses should develop, either through recruiting data scientists who match these attributes, or by developing these skills in existing employees:</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Analytics:</b> This involves determining which data is relevant to the question you’re hoping to answer and interpreting the data in order to derive those answers. Key skills include a knack for spotting patterns and establishing links, the ability to make sense of a range of data (both structured and unstructured) and a sound knowledge of industry-standard analytics packages like SAS Analytics and Oracle Data Mining.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Creativity:</b> Anyone can be formulaic – you need to aim for innovation that will set your business apart from the pack. Creativity is especially important for any business hoping to make sense of <i>unstructured data</i> – data that doesn’t fit comfortably into tables and charts. Valuable creative skills include a knack for problem solving (perhaps even spotting problems others aren’t yet aware of) and the ability to come up with new ways of gathering and interpreting data.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Maths and statistics: </b>People with a strong background in maths or statistics have a good grounding for big data-related work. 
You’re looking for at least a basic grasp of statistics and the ability to wrangle messy data into figures that can be quantified so that you can draw conclusions from them.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Computer science:</b> This very broad category covers a whole range of subfields, such as machine learning, databases and cloud computing. It may cover everything from plugging together the cables to creating sophisticated machine learning and natural language processing algorithms. Key skills include a solid understanding of database technology and a firm grasp of technologies such as Hadoop, Java and Python.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Business acumen: </b>People who work with big data need a firm grasp of the company’s goals and objectives, as well as an understanding of whether the business is heading in the right direction. This includes understanding what makes the company tick, what makes it thrive and why it stands out from its competitors (and if it’s not thriving, why it’s not).</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Communication: </b>You can have the best analytical skills in the world, but unless you’re able to present findings in a clear way and demonstrate how they can help to improve performance and drive success, all that analysis will go to waste. Great interpersonal and written communication skills are vital, as is the ability to add value to data through insights and analysis. A knack for storytelling and being able to bring data to life through visualization techniques will also help immensely.</p>\n</li>\n</ul>\n"},{"title":"10 steps to using data to improve business decisions","thumb":null,"image":null,"content":"<p>Data should be at the heart of strategic decision making in business, whether you run a huge multinational or a small family-run business. Big data can provide insights that help you answer your key business questions, such as ‘How can I improve customer satisfaction?’. 
Data leads to insights; business owners and managers can turn those insights into decisions and actions that improve the business.</p>\n<p class=\"Tip\">Use this ten-step process for making data-based decisions:</p>\n<ol class=\"level-one\">\n<li>\n<p class=\"first-para\">Start with strategy.</p>\n<p class=\"child-para\">Instead of starting with what data you could or should access, start by working out what your business is looking to achieve. In a nutshell, you need to work out what your strategic goals are (for example, increasing your customer base).</p>\n</li>\n<li>\n<p class=\"first-para\">Home in on the business area; identify your strategic objectives.</p>\n<p class=\"child-para\">Identify the areas most important to achieving your overall strategy. For most businesses, the customer, finance and operations areas are key.</p>\n</li>\n<li>\n<p class=\"first-para\">Identify unanswered questions.</p>\n<p class=\"child-para\">Work out which questions you need to answer in order to achieve those goals. By working out exactly what you need to know, you can focus on the data that you really need.</p>\n</li>\n<li>\n<p class=\"first-para\">Find the data that will help answer those questions.</p>\n<p class=\"child-para\">Focus on identifying the ideal data for you – the data that could help you answer your most pressing questions and deliver on your strategic objectives.</p>\n</li>\n<li>\n<p class=\"first-para\">Identify what data you already have or have access to.</p>\n<p class=\"child-para\">After you identify the data you need, it makes sense to see if you’re already sitting on some of that information, even if it isn’t immediately obvious.</p>\n</li>\n<li>\n<p class=\"first-para\">Work out if the costs and effort are justified.</p>\n<p class=\"child-para\">Only after you know the costs can you work out if the tangible benefits outweigh those costs. In this respect, you should treat data like any other key business investment. 
You need to make a clear case for the investment that outlines the long-term value of data to the business strategy.</p>\n</li>\n<li>\n<p class=\"first-para\">Collect the data.</p>\n<p class=\"child-para\">Much of this step comes down to setting up the processes and people to gather and manage your data. You may be buying access to an analysis-ready data set, in which case there’s no need to collect data as such. But, in reality, many data projects require some amount of data collection.</p>\n</li>\n<li>\n<p class=\"first-para\">Analyze the data.</p>\n<p class=\"child-para\">You need to analyze the data in order to extract meaningful and useful business insights. After all, there’s no point coming this far if you don’t then discover something new from the data.</p>\n</li>\n<li>\n<p class=\"first-para\">Present and distribute the insights.</p>\n<p class=\"child-para\">Unless the results are presented to the right people at the right time in a meaningful way, then the size of the data sets or the sophistication of the analytics tools don’t really matter. You need to make sure the insights gained from your data are used to inform decision making and, ultimately, improve performance.</p>\n</li>\n<li>\n<p class=\"first-para\">Incorporate the learning into the business.</p>\n<p class=\"child-para\">Finally, you need to apply the insights from the data to your decision making, making the decisions that will transform your business for the better – and then acting on those decisions. For me, this is the most rewarding part of the data journey: turning data into action.</p>\n</li>\n</ol>\n"},{"title":"How to communicate insights from big data","thumb":null,"image":null,"content":"<p>Big data can help you gain insight. 
Businesses gain competitive advantage when the <i>right information</i> is delivered to the <i>right people</i> at the <i>right time.</i> This means extracting insights and information from data and communicating them to decision makers in a way they’ll easily understand. After all, people are less likely to act if they have to work hard to understand the information in front of them.</p>\n<p class=\"Tip\">Make sure your insights shine through with these top tips:</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Identify your target audience. </b>Who your audience is depends on your strategic questions. The audience may be you if you’re the business owner, or it could be your human resources team, your marketing team or a combination. Ask yourself who’s going to see these results. What do they already know about the issues being discussed? What do they need and want to know? And, what will they do with the information?</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Customise the information for your audience.</b> Be prepared to customise your information to meet the specific requirements of each decision maker.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Remember what you’re trying to achieve.</b> Try not to get distracted by interesting insights that have nothing to do with answering your strategic questions and achieving your business goals. There may be scope to revisit those other insights in future but, for now, focus on what you set out to achieve.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Avoid creating a wall of text.</b> Remember that data can be presented as a number, a short written narrative, a table, a graph or a chart. 
In fact, the best approach is likely to involve a combination of these formats.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Use data visualisation techniques.</b> Visuals are great for conveying information because they’re quick and direct, they’re (usually) easy to understand, they’re memorable and they add interest, being much more likely to hold the reader’s attention than a full page of text.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>But don’t neglect the text. </b>Numbers, charts and visuals may only give a snapshot; narrative allows you to embellish on key points. Use short narratives to introduce what you’re showing and highlight the key insights.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Use clear headings to make the important points stand out.</b> This way, even at a quick glance, the key points will be obvious.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Link the information to your strategy. </b>If you’re presenting information that directly answers a strategic business question, such as ‘How do we reduce staff turnover by ten per cent?’, include that question in the opening narrative and maybe even the headline.</p>\n</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One 
year","lifeExpectancySetFrom":"2022-04-12T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":207432},{"headers":{"creationTime":"2016-03-27T16:51:15+00:00","modifiedTime":"2022-03-25T14:42:17+00:00","timestamp":"2022-03-25T18:01:12+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"},"slug":"databases","categoryId":33579}],"title":"Records Management For Dummies Cheat Sheet","strippedTitle":"records management for dummies cheat sheet","slug":"records-management-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Learn the key aspects of records management, including the benefits, retention scheduling, and managing on local and network drives.","noIndex":0,"noFollow":0},"content":"Whether you’re a small business owner or work for a global corporation, you deal with information every day. You receive information, you send it, you determine what’s relevant, and you make decisions, whether consciously or subconsciously, about what information to retain. That’s why records management — managing the flood of information you get every day — should be such an important part of your business strategy.\r\n\r\nThis Cheat Sheet can serve as a quick reference to some of the main aspects of record management.","description":"Whether you’re a small business owner or work for a global corporation, you deal with information every day. 
You receive information, you send it, you determine what’s relevant, and you make decisions, whether consciously or subconsciously, about what information to retain. That’s why records management — managing the flood of information you get every day — should be such an important part of your business strategy.\r\n\r\nThis Cheat Sheet can serve as a quick reference to some of the main aspects of record management.","blurb":"","authors":[{"authorId":9881,"name":"Blake Richardson CRM","slug":"blake-richardson","description":"Blake Richardson, CRM, is a certified records manager with more than 15 years of experience managing records and information for several Fortune 500 companies. He has been a records manager for CNA Insurance and the Dollar General Corporation, and is active in ARMA International. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9881"}},{"authorId":9882,"name":"CRM","slug":"crm","description":"","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9882"}}],"primaryCategoryTaxonomy":{"categoryId":33579,"title":"Databases","slug":"databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":173434,"title":"Appraising Records and Managing Retention Scheduling","slug":"appraising-records-and-managing-retention-scheduling","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173434"}},{"articleId":173432,"title":"Managing Records on Local and Network 
Drives","slug":"managing-records-on-local-and-network-drives","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173432"}},{"articleId":173421,"title":"Benefits of Records Management","slug":"benefits-of-records-management","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173421"}}],"fromCategory":[{"articleId":285910,"title":"Data Lakes For Dummies Cheat Sheet","slug":"data-lakes-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/285910"}},{"articleId":207503,"title":"SAS For Dummies Cheat Sheet","slug":"sas-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207503"}},{"articleId":193380,"title":"Selecting the Correct SAS Product","slug":"selecting-the-correct-sas-product","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/193380"}},{"articleId":173434,"title":"Appraising Records and Managing Retention Scheduling","slug":"appraising-records-and-managing-retention-scheduling","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173434"}},{"articleId":173432,"title":"Managing Records on Local and Network 
Drives","slug":"managing-records-on-local-and-network-drives","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173432"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281849,"slug":"records-management-for-dummies","isbn":"9781118388082","categoryList":["technology","information-technology","data-science","databases"],"amazon":{"default":"https://www.amazon.com/gp/product/1118388089/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1118388089/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1118388089-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1118388089/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1118388089/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/records-management-for-dummies-cover-9781118388082-203x255.jpg","width":203,"height":255},"title":"Records Management For Dummies","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p>Blake Richardson, CRM, is a Certified Records Manager with more than 15 years of experience managing records and information for several Fortune 500 companies. He has been a records manager for CNA Insurance and the Dollar General Corporation, and is active in ARMA International.</p> ","authors":[{"authorId":9881,"name":"Blake Richardson CRM","slug":"blake-richardson","description":"Blake Richardson, CRM, is a certified records manager with more than 15 years of experience managing records and information for several Fortune 500 companies. He has been a records manager for CNA Insurance and the Dollar General Corporation, and is active in ARMA International. 
","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9881"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781118388082&quot;]}]\" id=\"du-slot-623e0368d89eb\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781118388082&quot;]}]\" id=\"du-slot-623e0368d9352\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":173421,"title":"Benefits of Records Management","slug":"benefits-of-records-management","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173421"}},{"articleId":173434,"title":"Appraising Records and Managing Retention Scheduling","slug":"appraising-records-and-managing-retention-scheduling","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173434"}},{"articleId":173432,"title":"Managing Records on Local and Network 
Drives","slug":"managing-records-on-local-and-network-drives","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173432"}}],"content":[{"title":"Benefits of records management","thumb":null,"image":null,"content":"<p>Properly managing your records can help you reduce operating expenses, enhance customer service, and ensure your company complies with laws and regulations.</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Reduce operating expenses.</b> Properly managing your records and information means that you retain only what you need for operational, legal, and compliance purposes, keep it for a specific timeframe, and then dispose of it appropriately. This approach can eliminate the need to buy additional paper filing equipment and electronic storage. It can also reduce the cost of off-site record storage.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Enhance customer service.</b> A good records management strategy ensures that you don’t retain unneeded information. This reduces the amount of clutter you have to search through. Being able to quickly locate the right information allows you to better serve your customers.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Legal and compliance.</b> Implementing an effective records management program will allow you to identify records that are needed for legal and compliance purposes and ensure they are assigned the appropriate retention periods. 
This benefits organizations in the event of lawsuits, audits, and governmental inquiries.</p>\n</li>\n</ul>\n"},{"title":"Appraising records and managing retention scheduling","thumb":null,"image":null,"content":"<p>A fundamental part of a successful records and information management program is identifying what records and information your company possesses and then applying appropriate retention periods.</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Records appraisal.</b> There are different methods for appraising your records and information, including inventories, interviews, and questionnaires. Picking the right appraisal method for your organization is important. The appraisal allows you to identify what records and information the organization possesses and forms the basis for your company’s record retention schedule.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Retention schedule.</b> The retention schedule is typically the most referenced records management program document. It allows employees to determine how long they should keep their records and information. <i>Records Management For Dummies</i> addresses three distinct retention schedule methods: departmental, functional, and big bucket.</p>\n</li>\n</ul>\n"},{"title":"Managing records on local and network drives","thumb":null,"image":null,"content":"<p>The amount of electronic information has grown exponentially over the past decade. Local (C: Drives) and network drives are becoming digital graveyards. 
An effective records management program will provide you with the knowledge and tools you need to ensure that electronic files are properly managed.</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\"><b>Folder structures and names.</b> Creating an electronic folder structure that meets the needs of each department and ensuring that the folders within the structure are logically named are critical for accurate electronic filing, retrieval, and file maintenance (clean-up).</p>\n</li>\n<li>\n<p class=\"first-para\"><b>File naming conventions.</b> After you have created the proper electronic folder structure, it’s important to ensure that your files are also named in an appropriate manner. A rule of thumb for naming your files is that you should be able to determine the contents of a file without having to open it. This means avoiding cryptic names such as abbreviations, acronyms, and numbers.</p>\n</li>\n<li>\n<p class=\"first-para\"><b>Maintenance.</b> It is important to regularly review your folders and files to determine what is no longer needed. Microsoft Windows allows you to determine the last access and modification dates of a file. This is a good starting point in determining whether a file can be deleted. 
However, employees should always reference the organization’s record retention schedule before deleting any files in order to determine if the file is eligible for deletion.</p>\n</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Two years","lifeExpectancySetFrom":"2022-03-25T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":208207},{"headers":{"creationTime":"2016-03-27T16:47:03+00:00","modifiedTime":"2022-03-10T20:12:30+00:00","timestamp":"2022-03-11T00:01:06+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Big Data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"},"slug":"big-data","categoryId":33578}],"title":"Statistics for Big Data For Dummies Cheat Sheet","strippedTitle":"statistics for big data for dummies cheat sheet","slug":"statistics-for-big-data-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Overview of the three types of statistical measures: those of central tendency, central dispersion and association.","noIndex":0,"noFollow":0},"content":"Summary statistical measures represent the key properties of a sample or population as a single numerical value. 
This has the advantage of providing important information in a very compact form. It also simplifies comparing multiple samples or populations. Summary statistical measures can be divided into three types: measures of central tendency, measures of central dispersion, and measures of association.","description":"Summary statistical measures represent the key properties of a sample or population as a single numerical value. This has the advantage of providing important information in a very compact form. It also simplifies comparing multiple samples or populations. Summary statistical measures can be divided into three types: measures of central tendency, measures of central dispersion, and measures of association.","blurb":"","authors":[{"authorId":9080,"name":"Alan Anderson","slug":"alan-anderson","description":"Alan Anderson, PhD, is a professor of economics and finance at Fordham University and New York University. He's a veteran economist, risk manager, and fixed income analyst. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9080"}},{"authorId":9081,"name":"David Semmelroth","slug":"david-semmelroth","description":"David Semmelroth is an experienced data analyst, trainer, and statistics instructor who consults on customer databases and database marketing.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9081"}}],"primaryCategoryTaxonomy":{"categoryId":33578,"title":"Big Data","slug":"big-data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":142226,"title":"Discrete and Continuous Probability 
Distributions","slug":"discrete-and-continuous-probability-distributions","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142226"}},{"articleId":142209,"title":"10 Key Concepts in Hypothesis Testing","slug":"10-key-concepts-in-hypothesis-testing","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142209"}},{"articleId":142192,"title":"Overview of Graphical Techniques","slug":"overview-of-graphical-techniques","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142192"}},{"articleId":142191,"title":"Overview of Hypothesis Testing","slug":"overview-of-hypothesis-testing","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142191"}},{"articleId":142183,"title":"Measures of Association","slug":"measures-of-association","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142183"}}],"fromCategory":[{"articleId":207996,"title":"Big Data For Dummies Cheat Sheet","slug":"big-data-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207996"}},{"articleId":207432,"title":"Big Data for Small Business For Dummies Cheat Sheet","slug":"big-data-for-small-business-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207432"}},{"articleId":168988,"title":"Integrate Big Data with the Traditional Data 
Warehouse","slug":"integrate-big-data-with-the-traditional-data-warehouse","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168988"}},{"articleId":168986,"title":"Big Data Planning Stages","slug":"big-data-planning-stages","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168986"}},{"articleId":168985,"title":"How to Analyze Big Data to Get Results","slug":"how-to-analyze-big-data-to-get-results","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168985"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":282602,"slug":"statistics-for-big-data-for-dummies","isbn":"9781118940013","categoryList":["technology","information-technology","data-science","big-data"],"amazon":{"default":"https://www.amazon.com/gp/product/1118940016/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1118940016/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1118940016-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1118940016/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1118940016/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/statistics-for-big-data-for-dummies-cover-9781118940013-203x255.jpg","width":203,"height":255},"title":"Statistics for Big Data For Dummies","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p><b data-author-id=\"9080\">Alan Anderson, PhD,</b> is a professor of economics and finance at Fordham University and New York University. 
He's a veteran economist, risk manager, and fixed income analyst.</p> <p><b data-author-id=\"9081\">David Semmelroth</b> is an experienced data analyst, trainer, and statistics instructor who consults on customer databases and database marketing.</p>","authors":[{"authorId":9080,"name":"Alan Anderson","slug":"alan-anderson","description":"Alan Anderson, PhD, is a professor of economics and finance at Fordham University and New York University. He's a veteran economist, risk manager, and fixed income analyst. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9080"}},{"authorId":9081,"name":"David Semmelroth","slug":"david-semmelroth","description":"David Semmelroth is an experienced data analyst, trainer, and statistics instructor who consults on customer databases and database marketing.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9081"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781118940013&quot;]}]\" id=\"du-slot-622a91426a907\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781118940013&quot;]}]\" id=\"du-slot-622a91426b287\"></div></div>"},"articleType":{"articleType":"Cheat 
Sheet","articleList":[{"articleId":142176,"title":"Measures of Central Tendency","slug":"measures-of-central-tendency","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142176"}},{"articleId":142177,"title":"Measures of Central Dispersion","slug":"measures-of-central-dispersion","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142177"}},{"articleId":142183,"title":"Measures of Association","slug":"measures-of-association","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/142183"}}],"content":[{"title":"Measures of central tendency","thumb":null,"image":null,"content":"<p>Measures of central tendency show the center of a data set. Three of the most commonly used measures of central tendency are the mean, median, and mode.</p>\n<h2>Mean</h2>\n<p><i>Mean </i>is another word for average. 
Here is the formula for computing the mean of a sample:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484219.image0.jpg\" alt=\"image0.jpg\" width=\"77\" height=\"68\" /></p>\n<p>With this formula, you compute the sample mean by simply adding up all the elements in the sample and then dividing by the number of elements in the sample.</p>\n<p>Here is the corresponding formula for computing the mean of a population:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484220.image1.jpg\" alt=\"image1.jpg\" width=\"73\" height=\"68\" /></p>\n<p>Although the notation is slightly different, the procedure for computing a population mean is the same as the procedure for computing a sample mean.</p>\n<p class=\"Tip\">Greek letters are used to describe populations, whereas Roman letters are used to describe samples.</p>\n<h2>Median</h2>\n<p>The <i>median</i> of a data set is a value that divides the data into two equal halves. In other words, half of the elements of a data set are <i>less than </i>the median, and the remaining half are <i>greater than </i>the median. The procedure for computing the median is the same for both samples and populations.</p>\n<h2>Mode</h2>\n<p>The mode of a data set is the most commonly observed value in the data set. You determine the mode in the same way for a sample and a population.</p>\n"},{"title":"Measures of central dispersion","thumb":null,"image":null,"content":"<p>Measures of central dispersion show how &#8220;spread out&#8221; the elements of a data set are from the mean. Three of the most commonly used measures of central dispersion include the following:</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\">Range</p>\n</li>\n<li>\n<p class=\"first-para\">Variance</p>\n</li>\n<li>\n<p class=\"first-para\">Standard deviation</p>\n</li>\n</ul>\n<h2>Range</h2>\n<p>The <i>range</i> of a data set is the difference between the largest value and the smallest value. 
You compute it the same way for both samples and populations.</p>\n<h2>Variance</h2>\n<p>You can think of the variance as the average <i>squared</i> difference between the elements of a data set and the mean. The formulas for computing a sample variance and a population variance are slightly different.</p>\n<p>Here is the formula for computing sample variance:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484223.image0.jpg\" alt=\"image0.jpg\" width=\"130\" height=\"68\" /></p>\n<p>And here is the formula for computing population variance:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484224.image1.jpg\" alt=\"image1.jpg\" width=\"130\" height=\"68\" /></p>\n<h2>Standard deviation</h2>\n<p>The standard deviation is simply the square root of the variance. It&#8217;s more commonly used as a measure of dispersion than the variance because it&#8217;s measured in the same units as the elements of the data set, whereas the variance is measured in <i>squared </i>units.</p>\n"},{"title":"Measures of association","thumb":null,"image":null,"content":"<p>Measures of association quantify the strength and the direction of the relationship between two data sets. Here are the two most commonly used measures of association:</p>\n<ul class=\"level-one\">\n<li>\n<p class=\"first-para\">Covariance</p>\n</li>\n<li>\n<p class=\"first-para\">Correlation</p>\n</li>\n</ul>\n<p>Both measures are used to show how closely two data sets are related to each other. The main difference between them is the units in which they are measured. 
The correlation measure is defined to assume values between –1 and 1, which makes interpretation very easy.</p>\n<h2>Covariance</h2>\n<p>The <i>covariance</i> between two samples is computed as follows:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484213.image0.jpg\" alt=\"image0.jpg\" width=\"189\" height=\"68\" /></p>\n<p>The covariance between two populations is computed as follows:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484214.image1.jpg\" alt=\"image1.jpg\" width=\"208\" height=\"68\" /></p>\n<h2>Correlation</h2>\n<p>The <i>correlation</i> between two samples is computed like this:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484215.image2.jpg\" alt=\"image2.jpg\" width=\"84\" height=\"44\" /></p>\n<p>The correlation between two populations is computed like this:</p>\n<p><img loading=\"lazy\" src=\"https://www.dummies.com/wp-content/uploads/484216.image3.jpg\" alt=\"image3.jpg\" width=\"94\" height=\"44\" /></p>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Five years","lifeExpectancySetFrom":"2022-03-10T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":207478},{"headers":{"creationTime":"2016-03-27T16:47:09+00:00","modifiedTime":"2022-03-01T18:08:45+00:00","timestamp":"2022-03-02T00:01:02+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information 
Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"},"slug":"databases","categoryId":33579}],"title":"SAS For Dummies Cheat Sheet","strippedTitle":"sas for dummies cheat sheet","slug":"sas-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Learn how to choose the right statistical analysis system (SAS) product and use this handy list of procedures in the SAS Enterprise Guide.","noIndex":0,"noFollow":0},"content":"SAS Institute has hundreds of statistical analysis system products, so a partial list of the ones you might run will help you know which one to use for your job.\r\n\r\nThe tasks in <a href=\"https://support.sas.com/en/software/enterprise-guide-support.html\" target=\"_blank\" rel=\"noopener\">SAS Enterprise Guide</a> and SAS Add-In for Microsoft Office create SAS programs that call on SAS procedures. Having a list of those procedures and being able to find them quickly in SAS Enterprise Guide will boost your efficiency.","description":"SAS Institute has hundreds of statistical analysis system products, so a partial list of the ones you might run will help you know which one to use for your job.\r\n\r\nThe tasks in <a href=\"https://support.sas.com/en/software/enterprise-guide-support.html\" target=\"_blank\" rel=\"noopener\">SAS Enterprise Guide</a> and SAS Add-In for Microsoft Office create SAS programs that call on SAS procedures. 
Having a list of those procedures and being able to find them quickly in SAS Enterprise Guide will boost your efficiency.","blurb":"","authors":[{"authorId":9190,"name":"Stephen McDaniel","slug":"stephen-mcdaniel","description":"Stephen McDaniel is principal and cofounder of Freakalytics LLC, which provides training and consulting for data presentation, visual data exploration, and dashboard development. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9190"}},{"authorId":9191,"name":"Chris Hemedinger","slug":"chris-hemedinger","description":"Chris Hemedinger works in SAS R&D on the team that builds SAS Enterprise Guide, a popular user interface for SAS customers. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9191"}}],"primaryCategoryTaxonomy":{"categoryId":33579,"title":"Databases","slug":"databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":193380,"title":"Selecting the Correct SAS Product","slug":"selecting-the-correct-sas-product","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/193380"}},{"articleId":143291,"title":"SAS Procedures and Their Location in SAS Enterprise Guide","slug":"sas-procedures-and-their-location-in-sas-enterprise-guide","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/143291"}}],"fromCategory":[{"articleId":285910,"title":"Data Lakes For Dummies Cheat 
Sheet","slug":"data-lakes-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/285910"}},{"articleId":208207,"title":"Records Management For Dummies Cheat Sheet","slug":"records-management-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/208207"}},{"articleId":193380,"title":"Selecting the Correct SAS Product","slug":"selecting-the-correct-sas-product","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/193380"}},{"articleId":173434,"title":"Appraising Records and Managing Retention Scheduling","slug":"appraising-records-and-managing-retention-scheduling","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173434"}},{"articleId":173432,"title":"Managing Records on Local and Network 
Drives","slug":"managing-records-on-local-and-network-drives","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173432"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281857,"slug":"sas-for-dummies-2nd-edition","isbn":"9780470539682","categoryList":["technology","information-technology","data-science","databases"],"amazon":{"default":"https://www.amazon.com/gp/product/0470539682/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/0470539682/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/0470539682-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/0470539682/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/0470539682/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/sas-for-dummies-2nd-edition-cover-9780470539682-203x255.jpg","width":203,"height":255},"title":"SAS For Dummies, 2nd Edition","testBankPinActivationLink":"","bookOutOfPrint":false,"authorsInfo":"\n <p><b data-author-id=\"9190\">Stephen McDaniel</b> is Principal and cofounder of Freakalytics™ LLC, which provides training and consulting for data presentation, visual data exploration, and dashboard development. <b data-author-id=\"9191\">Chris Hemedinger</b> works in SAS R&amp;D on the team that builds SAS Enterprise Guide, a popular user interface for SAS customers. </p>","authors":[{"authorId":9190,"name":"Stephen McDaniel","slug":"stephen-mcdaniel","description":"Stephen McDaniel is principal and cofounder of Freakalytics LLC, which provides training and consulting for data presentation, visual data exploration, and dashboard development. 
","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9190"}},{"authorId":9191,"name":"Chris Hemedinger","slug":"chris-hemedinger","description":"Chris Hemedinger works in SAS R&D on the team that builds SAS Enterprise Guide, a popular user interface for SAS customers. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9191"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9780470539682&quot;]}]\" id=\"du-slot-621eb3be932f3\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9780470539682&quot;]}]\" id=\"du-slot-621eb3be93cb3\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":193380,"title":"Selecting the Correct SAS Product","slug":"selecting-the-correct-sas-product","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/193380"}},{"articleId":143291,"title":"SAS Procedures and Their Location in SAS Enterprise 
Guide","slug":"sas-procedures-and-their-location-in-sas-enterprise-guide","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/143291"}}],"content":[{"title":"Selecting the correct SAS product","thumb":null,"image":null,"content":"<p>SAS Institute offers hundreds of SAS products, and sometimes it&#8217;s difficult to decide which tool you should use for your work. Here is a <i>partial</i> list of SAS products you might encounter and who uses them for what purpose. As a SAS customer, you might use just one of these products or a few of them; if you&#8217;re really lucky, you might use them all.</p>\n<h2>SAS, or the SAS System</h2>\n<p>The SAS System is the original SAS product that customers have used in one form or another for more than 30 years, on systems ranging from big mainframes to laptops. It&#8217;s also known as Display Manager (the name of the windowing interface), or Base SAS, or just plain old SAS. The SAS System is primarily a tool for people comfortable with writing SAS programs. It contains the data processing and analytics engine that is at the core of most SAS products.</p>\n<h2>SAS Enterprise Guide</h2>\n<p>SAS Enterprise Guide provides a modern, easy-to-use interface to much of the power of SAS. SAS Enterprise Guide is used by SAS programmers, business analysts (who might or might not have programming skills), and statisticians. It’s a Microsoft Windows application that can connect to SAS; you can use it to drive the SAS analytics engine running on a mainframe, UNIX, or other remote machines as a server application. 
SAS Enterprise Guide is like a general store for SAS, where you can get a little bit of everything that SAS has to offer.</p>\n<h2>SAS Data Integration Studio</h2>\n<p>SAS Data Integration Studio is used to create and maintain data warehouses and data marts, which are specialized stores of data that have been prepared for effective reporting and analytics. Data experts, such as database administrators and IT specialists — people who support other folks who have to create reports — use SAS Data Integration Studio. Like SAS Enterprise Guide, this is a client application that runs on your desktop and provides an intuitive user interface, but it can connect to SAS and databases that run on machines all over your organization.</p>\n<h2>SAS Enterprise Miner</h2>\n<p>SAS Enterprise Miner is used for data mining, or investigating patterns in large amounts of data. Statisticians and professional modelers use SAS Enterprise Miner to segment data and create descriptive or predictive models. For example, a bank might use such a model to predict how likely you are to respond to a certain credit card offering. If your data profile is similar enough to others who have responded to similar offers, SAS Enterprise Miner would produce a model that indicates you&#8217;re worth sending the offer to. Hello, Platinum card!</p>\n<h2>SAS Add-In for Microsoft Office</h2>\n<p>Some people spend most of their working days working with a Microsoft Office application such as Excel or PowerPoint. SAS Add-In for Microsoft Office lets you open SAS data sources and run SAS analyses without ever having to leave the comfy world of your spreadsheet or slideshow. SAS Add-In for Microsoft Office is used by business analysts who don&#8217;t really need to know anything about SAS programming but need the answers that SAS can provide.</p>\n<h2>SAS Web Report Studio</h2>\n<p>All business intelligence software vendors must have a Web-based reporting product, and SAS Web Report Studio fits that bill. 
SAS Web Report Studio allows you to create and distribute reports to anyone who needs them, all without leaving your Web browser.</p>\n<h2>SAS Forecast Studio</h2>\n<p>SAS Forecast Studio analyzes time-based data and forecasts future trends and events. It&#8217;s like a crystal ball, only better! SAS Forecast Studio is used by professional modelers or statisticians who understand concepts such as seasonality and intermittent demand models. However, no SAS programming is required!</p>\n<h2>JMP</h2>\n<p>JMP is a standalone, highly visual analytics product. It runs on Microsoft Windows, Apple Macintosh, or Linux-based computers. JMP is sometimes packaged with SAS and can work with other SAS products, but most often it&#8217;s used by researchers, engineers, and quality-control experts who want advanced analytics without a big software footprint.</p>\n"},{"title":"SAS procedures and locations in the SAS Enterprise Guide","thumb":null,"image":null,"content":"<p>The tasks in SAS Enterprise Guide and SAS Add-In for Microsoft Office cover a wide range of SAS capabilities.</p>\n<p>These SAS tasks are easy-to-use interfaces that create SAS programs to do their work. The programs call on SAS procedures, where each procedure represents a specialized capability. 
And to make it even more interesting, SAS (the company) provides these procedures in product bundles, which must be separately licensed for customers to use them.</p>\n<p>With all of this at work, it can be tricky to keep track of which tasks use which procedures that need which SAS products.</p>\n<p>The following table lays it all out for you: all the tasks, how to find them in the menu, which procedures they use, and which SAS products are involved.</p>\n<table>\n<tbody>\n<tr>\n<th>Task<br />\nand<br />\nMenu Location</th>\n<th>SAS Procedures<br />\nand<br />\nSAS Products</th>\n</tr>\n<tr>\n<td>Import Data (Excel)<br />\nFile→Import Data</td>\n<td>DATA step; IMPORT<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Open Information Map<br />\nFile→Open</td>\n<td>DATA step<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Linear Models<br />\nTasks→ANOVA</td>\n<td>GLM<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Mixed Models<br />\nTasks→ANOVA</td>\n<td>MIXED<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Nonparametric One-Way ANOVA<br />\nTasks→ANOVA</td>\n<td>NPAR1WAY<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>One-Way ANOVA<br />\nTasks→ANOVA</td>\n<td>ANOVA; GPLOT<br />\nSAS/STAT; SAS/GRAPH</td>\n</tr>\n<tr>\n<td>t Test<br />\nTasks→ANOVA</td>\n<td>TTEST<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>CDF Plots<br />\nTasks→Capability</td>\n<td>CAPABILITY<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Histograms<br />\nTasks→Capability</td>\n<td>CAPABILITY<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>P-P Plots<br />\nTasks→Capability</td>\n<td>CAPABILITY<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Probability Plots<br />\nTasks→Capability</td>\n<td>CAPABILITY<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Q-Q Plots<br />\nTasks→Capability</td>\n<td>CAPABILITY<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Box Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>c Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Individual Measurements Chart<br 
/>\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Mean and Range Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Mean and Standard Deviation Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>np Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>p Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>u Chart<br />\nTasks→Control Charts</td>\n<td>SHEWHART<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Append Table<br />\nTasks→Data</td>\n<td>SQL<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Compare Data<br />\nTasks→Data</td>\n<td>COMPARE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Create Format<br />\nTasks→Data</td>\n<td>FORMAT<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Data Set Attributes<br />\nTasks→Data</td>\n<td>DATASETS<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Delete Data Sets and Formats<br />\nTasks→Data</td>\n<td>DELETE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Download Data Files to PC<br />\nTasks→Data</td>\n<td>SQL<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Filter and Sort<br />\nTasks→Data</td>\n<td>SQL<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Query Builder<br />\nTasks→Data</td>\n<td>SQL<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Random Sample<br />\nTasks→Data</td>\n<td>SURVEYSELECT<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Rank<br />\nTasks→Data</td>\n<td>RANK<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Sort Data<br />\nTasks→Data</td>\n<td>SORT<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Split Columns<br />\nTasks→Data</td>\n<td>TRANSPOSE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Stack Columns<br />\nTasks→Data</td>\n<td>TRANSPOSE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Standardize Data<br />\nTasks→Data</td>\n<td>STANDARD<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Table Analysis<br 
/>\nTasks→Data</td>\n<td>FREQ<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Transpose<br />\nTasks→Data</td>\n<td>TRANSPOSE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Upload Data to SAS Server<br />\nTasks→Data</td>\n<td>SQL<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Import from SPSS<br />\nTasks→Data</td>\n<td>IMPORT<br />\nSAS/ACCESS Interface to PC File Formats</td>\n</tr>\n<tr>\n<td>Import from JMP<br />\nTasks→Data</td>\n<td>IMPORT<br />\nSAS/ACCESS Interface to PC File Formats</td>\n</tr>\n<tr>\n<td>Import from Stata<br />\nTasks→Data</td>\n<td>IMPORT<br />\nSAS/ACCESS Interface to PC File Formats</td>\n</tr>\n<tr>\n<td>Characterize Data Wizard<br />\nTasks→Describe</td>\n<td>CONTENTS; UNIVARIATE; FREQ<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Distribution Analysis<br />\nTasks→Describe</td>\n<td>UNIVARIATE<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>List Data<br />\nTasks→Describe</td>\n<td>PRINT<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>List Report<br />\nTasks→Describe</td>\n<td>REPORT<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>One-Way Frequencies<br />\nTasks→Describe</td>\n<td>FREQ<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Summary Statistics<br />\nTasks→Describe</td>\n<td>MEANS<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Summary Tables<br />\nTasks→Describe</td>\n<td>TABULATE<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Area Plot<br />\nTasks→Graph</td>\n<td>GPLOT<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Bar Chart<br />\nTasks→Graph</td>\n<td>GCHART<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Bar-Line Chart<br />\nTasks→Graph</td>\n<td>GBARLINE<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Box Plot<br />\nTasks→Graph</td>\n<td>GPLOT<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Bubble Plot<br />\nTasks→Graph</td>\n<td>GPLOT<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Contour Plot<br />\nTasks→Graph</td>\n<td>GCONTOUR<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Create Map Feature Table<br />\nTasks→Graph</td>\n<td>GPROJECT<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Donut Chart<br 
/>\nTasks→Graph</td>\n<td>GCHART<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Line Plot<br />\nTasks→Graph</td>\n<td>GPLOT<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Map Graph<br />\nTasks→Graph</td>\n<td>GMAP<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Pie Chart<br />\nTasks→Graph</td>\n<td>GCHART<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Radar Chart<br />\nTasks→Graph</td>\n<td>GRADAR<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Scatter Plot<br />\nTasks→Graph</td>\n<td>GPLOT; G3D<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Surface Plot<br />\nTasks→Graph</td>\n<td>G3D<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Tile Chart<br />\nTasks→Graph</td>\n<td>GTILE<br />\nSAS/GRAPH</td>\n</tr>\n<tr>\n<td>Model Scoring<br />\nTasks→Model Scoring</td>\n<td>NONE<br />\nSAS Enterprise Miner</td>\n</tr>\n<tr>\n<td>Canonical Correlation<br />\nTasks→Multivariate</td>\n<td>CANCORR<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Cluster Analysis<br />\nTasks→Multivariate</td>\n<td>CLUSTER; TREE; FASTCLUS<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Correlations<br />\nTasks→Multivariate</td>\n<td>CORR<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Discriminant Analysis<br />\nTasks→Multivariate</td>\n<td>DISCRIM<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Factor Analysis<br />\nTasks→Multivariate</td>\n<td>FACTOR<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Principal Components<br />\nTasks→Multivariate</td>\n<td>PRINCOMP<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Pareto Chart<br />\nTasks→Pareto</td>\n<td>PARETO<br />\nSAS/QC</td>\n</tr>\n<tr>\n<td>Generalized Linear Models<br />\nTasks→Regression</td>\n<td>GENMOD<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Linear Regression<br />\nTasks→Regression</td>\n<td>REG<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Logistic Regression<br />\nTasks→Regression</td>\n<td>LOGISTIC<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Nonlinear Regression<br />\nTasks→Regression</td>\n<td>NLIN<br />\nSAS/STAT</td>\n</tr>\n<tr>\n<td>Life Tables<br 
/>\nTasks→Survival Analysis</td>\n<td>LIFETEST<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Proportional Hazards<br />\nTasks→Survival Analysis</td>\n<td>PHREG<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>ARIMA Modeling and Forecasting<br />\nTasks→Time Series</td>\n<td>ARIMA<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Basic Forecasting<br />\nTasks→Time Series</td>\n<td>FORECAST<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Create Time Series Data<br />\nTasks→Time Series</td>\n<td>TIMESERIES<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Forecast Studio Create Project Wizard<br />\nTasks→Time Series</td>\n<td>Forecast Studio<br />\nSAS Forecast Server</td>\n</tr>\n<tr>\n<td>Forecast Studio Open Project Wizard<br />\nTasks→Time Series</td>\n<td>Forecast Studio<br />\nSAS Forecast Server</td>\n</tr>\n<tr>\n<td>Forecast Studio Override Wizard<br />\nTasks→Time Series</td>\n<td>Forecast Studio<br />\nSAS Forecast Server</td>\n</tr>\n<tr>\n<td>Prepare Time Series Data<br />\nTasks→Time Series</td>\n<td>EXPAND<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Regression Analysis of Panel Data<br />\nTasks→Time Series</td>\n<td>TSCSREG<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Regression Analysis with Autoregressive Errors<br />\nTasks→Time Series</td>\n<td>AUTOREG<br />\nSAS/ETS</td>\n</tr>\n<tr>\n<td>Update Library Metadata<br />\nTools</td>\n<td>METALIB<br />\nBase SAS</td>\n</tr>\n<tr>\n<td>Assign Project Library<br />\nTools</td>\n<td>LIBNAME<br />\nBase 
SAS</td>\n</tr>\n</tbody>\n</table>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2022-03-01T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":207503},{"headers":{"creationTime":"2021-12-08T20:38:27+00:00","modifiedTime":"2022-03-01T17:41:36+00:00","timestamp":"2022-03-01T18:01:07+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General (Data Science)","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Microsoft Power BI For Dummies Cheat Sheet","strippedTitle":"microsoft power bi for dummies cheat sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Here's a handy guide with important features and aspects of Microsoft Power BI. Keep it by your side as you're learning the platform.","noIndex":0,"noFollow":0},"content":"Microsoft Power BI is an enterprise-class data analytics and business intelligence platform that users connect to for data analysis, visualization, collaboration, and distribution. 
The platform takes a unified, scalable approach to business intelligence that enables users to gain deeper data insights while using virtually any data source available. With Power BI, you can access tools to support the entire data analysis lifecycle, from importing to transformation to visualization and collaboration.\r\n\r\nPower BI, as part of the Microsoft Power Platform, complements its sister applications: Power Apps (for no-code/low-code application development), Power Automate (for workflow development), and Power Virtual Agents (for chatbots). These applications work well with one another. Strong integration opportunities also exist between Microsoft 365 (Word, Excel, PowerPoint, and SharePoint) and Dynamics 365.\r\n\r\nWhen you’re looking to realize the value of your data using Microsoft applications, or even third-party applications, Power BI can provide the insights you and your organization look for at speed and scale.","description":"Microsoft Power BI is an enterprise-class data analytics and business intelligence platform that users connect to for data analysis, visualization, collaboration, and distribution. The platform takes a unified, scalable approach to business intelligence that enables users to gain deeper data insights while using virtually any data source available. With Power BI, you can access tools to support the entire data analysis lifecycle, from importing to transformation to visualization and collaboration.\r\n\r\nPower BI, as part of the Microsoft Power Platform, complements its sister applications: Power Apps (for no-code/low-code application development), Power Automate (for workflow development), and Power Virtual Agents (for chatbots). These applications work well with one another. 
Strong integration opportunities also exist between Microsoft 365 (Word, Excel, PowerPoint, and SharePoint) and Dynamics 365.\r\n\r\nWhen you’re looking to realize the value of your data using Microsoft applications, or even third-party applications, Power BI can provide the insights you and your organization look for at speed and scale.","blurb":"","authors":[{"authorId":34674,"name":"Jack A. Hyman","slug":"jack-a-hyman","description":"Jack A. Hyman is an associate professor in the Computer Information Sciences department at the University of the Cumberlands in Williamsburg, Kentucky. He is an author, consultant, and speaker specializing in security, blockchain, mobility, and usability engineering.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/34674"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business 
Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}},{"articleId":275234,"title":"Blockchain Use Cases","slug":"blockchain-use-cases","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275234"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":289740,"slug":"microsoft-power-bi-for-dummies","isbn":"9781119824879","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119824877/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119824877/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119824877-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119824877/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119824877/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/9781119824879-203x255.jpg","width":203,"height":255},"title":"Microsoft Power BI For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"\n <p><b data-author-id=\"34674\">Jack A. Hyman</b> is an associate professor in the Computer Information Sciences department at the University of the Cumberlands in Williamsburg, Kentucky. 
He is an author, consultant, and speaker specializing in security, blockchain, mobility, and usability engineering.</p>","authors":[{"authorId":34674,"name":"Jack A. Hyman","slug":"jack-a-hyman","description":"Jack A. Hyman is an associate professor in the Computer Information Sciences department at the University of the Cumberlands in Williamsburg, Kentucky. He is an author, consultant, and speaker specializing in security, blockchain, mobility, and usability engineering.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/34674"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119824879&quot;]}]\" id=\"du-slot-621e5f632ce8d\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119824879&quot;]}]\" id=\"du-slot-621e5f632d3fd\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":0,"title":"","slug":null,"categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/"}}],"content":[{"title":"Getting to know the Power BI versions","thumb":null,"image":null,"content":"<p>When you want to select the version of Power BI that’s right for you, it might 
become a bit perplexing. That’s because several versions of Power BI are available, some free and some not so free.</p>\n<p>The free-versus-paid distinction applies to Power BI Services, the online platform for Power BI. Power BI Desktop is always free. Let’s start with the licensing models:</p>\n<table>\n<tbody>\n<tr>\n<td width=\"175\"><strong>Version</strong></td>\n<td width=\"357\"><strong>Description</strong></td>\n</tr>\n<tr>\n<td width=\"175\"><strong>Power BI Services Free</strong></td>\n<td width=\"357\">A user can access content in My Workspace.</td>\n</tr>\n<tr>\n<td width=\"175\"><strong>Power BI Services Pro</strong></td>\n<td width=\"357\">A user can publish content to other workspaces, share dashboards, and subscribe to dashboards and reports. Users can share only with users who have a Pro or Premium license.</td>\n</tr>\n<tr>\n<td width=\"175\"><strong>Power BI Services Premium Pay-per-User</strong></td>\n<td width=\"357\">A user can publish content to other workspaces, share dashboards, and subscribe to dashboards and reports. Users can share with users who have a Pro or Premium license. The main difference here is the number of refreshes and capacity allotted at the user level. This option is intended for individual users with big-data-oriented datasets.</td>\n</tr>\n<tr>\n<td width=\"175\"><strong>Power BI Services Premium Pay-per-Capacity</strong></td>\n<td width=\"357\">It has the same features as Pay-per-User, except that the scale and size of the storage repository and the refresh rate are even higher than in the Pay-per-User model.</td>\n</tr>\n</tbody>\n</table>\n<p>It may get a little confusing because some of the products advertised are free and others require licensing. The versions of Power BI where production-ready data can be exposed to users include Free, Pro, Premium, Mobile, Embedded, and Report Server. 
Here’s a description of each one:</p>\n<table>\n<tbody>\n<tr>\n<td><strong>Product Version</strong></td>\n<td><strong>Description</strong></td>\n</tr>\n<tr>\n<td><strong>Desktop</strong></td>\n<td>The free desktop version of Power BI allows a user to author reports and data analytics inputs without publishing them to the Internet. If you want to collaborate or share Desktop output, you need the Pro or Premium version.</td>\n</tr>\n<tr>\n<td><strong>Free</strong></td>\n<td>Considered the entry-level free cloud version, it lets you author and store reports online rather than on the desktop. Its drawbacks are limited storage capacity and no opportunities for collaboration.</td>\n</tr>\n<tr>\n<td><strong>Pro</strong></td>\n<td>In the entry-level paid version of Power BI, you get a larger storage allocation and the ability to collaborate with Pro-licensed users.</td>\n</tr>\n<tr>\n<td><strong>Premium</strong></td>\n<td>The enterprise paid version comes in two editions: per user and capacity. Per-user licensing is intended for those with big data aspirations who also need massive storage scale without the global distribution requirements. Capacity is useful for an enterprise that intends to have many users. There is one catch with Capacity licensing: You also need to obtain Pro licenses. What you’re paying for is the storage and security, which is the killer feature.</td>\n</tr>\n<tr>\n<td><strong>Mobile</strong></td>\n<td>Intended to be a complementary product to manage reports, dashboards, and KPIs on the go, it offers limited, if any, authoring capabilities. 
Your ability to collaborate on mobile varies depending on your license authorization.</td>\n</tr>\n<tr>\n<td><strong>Embedded</strong></td>\n<td>This is a way to integrate real-time reports on public- or private-facing products using the Power BI API service in Microsoft Azure.</td>\n</tr>\n<tr>\n<td><strong>Report Server</strong></td>\n<td>This server-based Power BI product is intended to produce reporting output offline. Users store their reports on a server, not online.</td>\n</tr>\n</tbody>\n</table>\n<p class=\"article-tips remember\">Power BI Desktop and Power BI Free cost nothing, but you have little to no opportunities for collaboration. If you want to collaborate, you need to purchase at least a Power BI Pro license.</p>\n"},{"title":"The basics of how to import and transform data in Power BI","thumb":null,"image":null,"content":"<p>Whether you’re using Power BI Desktop or Power BI Services, you want to know the various ways to import data so that you can produce visualizations, reports, dashboards, and KPIs after the modeling and transformation activities are complete.</p>\n<p>You can use one of four storage modes in Power BI:</p>\n<ul>\n<li><strong>Direct Import:</strong> You can import the data locally, which allows for data caching. When you ingest the model, a user can employ all Desktop features available with Power BI.</li>\n<li><strong>DirectQuery:</strong> You can create a connection to the data source. In this mode, the data isn’t cached. Instead, the source must be queried each time a data call is made. Most data sources support DirectQuery. When you’re looking for a choice for big data, consider DirectQuery. However, if you require flexibility, steer clear.</li>\n<li><strong>Live Connection: </strong>You can ingest data from the SQL Server Analysis Services in connection with Power BI Desktop or Power BI Services. 
Live Connection supports calculation-based activities that occur within a data model.</li>\n<li><strong>Composite Models: </strong>When you need to combine the best of Direct Import and DirectQuery or fulfill the requirement to connect to several DirectQuery connections, the composite model is one to consider.</li>\n</ul>\n<p>Regardless of the method you choose, the import process starts in Power BI Desktop when you go to the Home tab. You’ll find various ways to ingest data. The bulk of your data sources can be found using the Get Data contextual menu in the Data area of the Ribbon&#8217;s Home tab, shown below.</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289746\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS01.jpg\" alt=\"MS Power BI ribbon\" width=\"535\" height=\"62\" /></p>\n<p>If you want to avoid using Power BI Desktop altogether and import an existing dataset you have or perhaps even create one online, Power BI Services gives you that option.</p>\n<p>Start by clicking the Create icon (the plus sign [+] on the navigation bar along the left side; see the following figure), and then either create a new dataset online (using the Paste or Manually Enter Data option) or publish an existing dataset to Power BI Services (using the Pick a Published Dataset option).</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289749\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS02-1.jpg\" alt=\"Power BI create icon\" width=\"535\" height=\"267\" /></p>\n"},{"title":"Learning your visualization options in Power BI","thumb":null,"image":null,"content":"<p>You’ve gone ahead and transformed and modeled the data in Power BI. Perhaps you didn’t use Power BI to complete the data cleansing and transformation process. Still, you have a large dataset that you want to visualize. 
Your endgame is to bring any sort of dataset into Power BI Services to create reports, dashboards, and perhaps a few KPIs.</p>\n<p>You can create reports and KPIs using Power BI Desktop and then publish them to a Power BI workspace. Alternatively, you can produce visualizations directly from your dataset in Power BI Services. Either way, your end goal is to produce a type of report that’s shareable (assuming that it’s more than just you accessing the report).</p>\n<p>In any of these scenarios, you start with accessing a report. In Power BI Desktop, all data visualization happens on the Reports tab. (See the following figure.)</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289750\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS03.jpg\" alt=\"Power BI reports tab\" width=\"200\" height=\"270\" /></p>\n<p class=\"article-tips remember\">If you go the Power BI Services route rather than work with Desktop, you create a new report in a workspace.</p>\n<p>Assuming that you have associated a dataset with a report, you have fields in the Fields pane, visualization options in the Visualizations pane, and the ability to refine your report in the Filters pane.</p>\n<p>You can filter fields (as shown for the term Recipients) when the dataset has hundreds of fields. You tighten your queries in the Filters pane.</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289751\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS04.jpg\" alt=\"Power BI filters\" width=\"535\" height=\"322\" /></p>\n<p>Power BI includes more than 20 out-of-the-box visualization options. Many more can be downloaded from the Microsoft website. 
The visualization options run the gamut from bar charts, pie charts, and treemaps to tables, KPIs, and matrixes.</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289752\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS05.jpg\" alt=\"Power BI visualizations\" width=\"350\" height=\"213\" /></p>\n"},{"title":"Know the nuts and bolts of DAX for Power BI","thumb":null,"image":null,"content":"<p>You can accomplish 95 percent of your work using the low-code, no-code capabilities integrated into one of the versions of Power BI. At times, however, you may require a granular manipulation of data or a deeper dive into your data.</p>\n<p>When you want to make sophisticated calculations with little to no effort using a structured approach, turn to the formula language Data Analysis Expressions (DAX).</p>\n<p>DAX consists of formulas and expressions used for data manipulation in data analysis tools such as Power BI. Functions, formulas, constants, and operators are used as part of DAX to create expressions that are easily entered using the editor tools in Power Query, which is the data transformation utility within the Power BI suite. (See the following figure.)</p>\n<p><img loading=\"lazy\" class=\"alignnone size-full wp-image-289753\" src=\"https://www.dummies.com/wp-content/uploads/9781119824879-CS06.jpg\" alt=\"Power BI power query\" width=\"535\" height=\"93\" /></p>\n<p class=\"article-tips tip\">If you’ve ever used Microsoft Excel formulas and calculations, you’ll soon notice that DAX is merely an advanced version with sophisticated data manipulation capabilities targeted at business intelligence and data modeling tools.</p>\n<p>DAX combines three fundamental concepts: syntax, context, and functions. Each time you use the Power Query editor, you need to apply specific rules to create formulaic expressions that perform more precise calculations or manipulate datasets. 
So, what are syntax, context, and functions exactly?</p>\n<ul>\n<li><strong>Syntax </strong>refers to the components within the formula you make. It’s the language used in the formula, such as the command, sign, operators, column or row, or tables. In other words, syntax is the programmatic structure.</li>\n<li><strong>Context </strong>refers to the target row incorporated into the formula for retrieval or calculation. You need to know two types to be literate in Power BI DAX: row and filter context.</li>\n<li><strong>Functions </strong>refer to the predefined and known commands in a system. These are the commands that are readily used to manipulate data without having to craft extended coding samples. More than 250 functions are available for DAX in Power BI, which is well beyond what other Microsoft solutions offer.</li>\n</ul>\n<p class=\"article-tips tip\">Be consistent with your naming conventions and data formatting. Otherwise, it can become overly complex to update your DAX formulas.</p>\n"},{"title":"Look for these Power BI service-only features","thumb":null,"image":null,"content":"<p>Power BI Desktop is useful when you want to work through datasets on your own. The second that you want to share and collaborate with others, you need to begin publishing your work from the Desktop to Power BI Services, the online platform for Power BI.</p>\n<p>The main reason you use the online version is to allow others to view your deliverables. That can be accomplished using the Power BI Services Free version. When you decide, though, to allow users to share and collaborate in Power BI, you need to leverage the workspace.</p>\n<p>Power BI has two types of workspaces: a project workspace and a personal workspace. The Power BI workspace contains all content specific to an app. When designers create an app, they bundle all the content assets necessary for use and deployment. 
The content might include datasets, dashboards, and reports.</p>\n<p>Whereas the project workspace is intended for sharing and collaborating with others, My Workspace is similar to Power BI Desktop. The only difference is that you control your self-created assets in My Workspace. In contrast, project workspaces can also be managed by others.</p>\n<p class=\"article-tips remember\">You need to have the correct type of license to access another user&#8217;s workspace for collaboration and sharing.</p>\n<p>The workspace isn’t the only feature that makes Power BI Services worth the investment. You get prebuilt reporting capabilities, including several that leverage Microsoft&#8217;s impressive artificial intelligence infrastructure:</p>\n<ul>\n<li><strong>Access management:</strong> Control who has access to which reports, dashboards, and datasets within a workspace.</li>\n<li><strong>Create mobile-ready Power BI apps:</strong> Package content specific to your project in an app. Content may include reports, dashboards, and datasets.</li>\n<li><strong>Quick Insights:</strong> This cloud-only Power BI feature helps you analyze datasets and find patterns, trends, and outliers.</li>\n<li><strong>Analyze in Excel:</strong> If your data is a bit too complex to deal with in Power BI and you want to review a smaller subset with a familiar business productivity tool, Power BI integrates with Excel to allow an analyst to view and interact with their data using PivotTables, charts, and slicers.</li>\n<li><strong>Usage metrics reports:</strong> Treat this one as a status check on how your content is viewed or shared across platforms.</li>\n<li><strong>Paginated reports:</strong> A built-in, online-only option that allows users to create print-friendly reports for sharing and distribution. 
These reports are static because they’re fixed for print-ready presentation.</li>\n<li><strong>Data lineage:</strong> Want to know what data sources are used for what reports and datasets? If you want an upstream and downstream review of your data, data lineage offers a 360-degree view of your data chronology.</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2021-12-08T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":289744},{"headers":{"creationTime":"2017-04-24T17:12:45+00:00","modifiedTime":"2022-02-28T16:54:19+00:00","timestamp":"2022-02-28T18:01:08+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General (Data Science)","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Algorithms For Dummies Cheat Sheet","strippedTitle":"algorithms for dummies cheat sheet","slug":"algorithms-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"In this Cheat Sheet, you'll find helpful tips for using algorithms and information about the programming languages you'll need.","noIndex":0,"noFollow":0},"content":"Algorithms are fun! 
Algorithms are beautiful! Algorithms are even better than your favorite pastime! Well, perhaps not the last one. In fact, algorithms surround you in many ways you might not have thought about, and you use them every day to perform important tasks.\r\n\r\nHowever, you need to be able to use algorithms in a way that doesn’t involve becoming a mathematician. Programming languages make it possible to describe the steps used to create an algorithm, and some languages are better than others at performing this task so that people can understand it without becoming computer or data scientists. Python makes using algorithms easier because it comes with a lot of built-in and extended support (through the use of packages, datasets, and other resources).\r\n\r\nWith that in mind, this Cheat Sheet helps you access the most commonly needed tips for making your use of algorithms fast and easy.","description":"Algorithms are fun! Algorithms are beautiful! Algorithms are even better than your favorite pastime! Well, perhaps not the last one. In fact, algorithms surround you in many ways you might not have thought about, and you use them every day to perform important tasks.\r\n\r\nHowever, you need to be able to use algorithms in a way that doesn’t involve becoming a mathematician. Programming languages make it possible to describe the steps used to create an algorithm, and some languages are better than others at performing this task so that people can understand it without becoming computer or data scientists. 
Python makes using algorithms easier because it comes with a lot of built-in and extended support (through the use of packages, datasets, and other resources).\r\n\r\nWith that in mind, this Cheat Sheet helps you access the most commonly needed tips for making your use of algorithms fast and easy.","blurb":"","authors":[{"authorId":9109,"name":"John Paul Mueller","slug":"john-paul-mueller","description":"John Paul Mueller has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9109"}},{"authorId":9110,"name":"Luca Massaron","slug":"luca-massaron","description":"Luca Massaron is a Google developer expert in machine learning. Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9110"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General (Data Science)","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[{"articleId":242511,"title":"Keeping Greedy Algorithms under Control","slug":"keeping-greedy-algorithms-control","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/242511"}},{"articleId":242501,"title":"Greedy Algorithms","slug":"greedy-algorithms","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/242501"}},{"articleId":242494,"title":"Counting 
Objects in a Data Stream","slug":"counting-objects-data-stream","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/242494"}},{"articleId":242486,"title":"How to Find the Number of Elements in a Data Stream","slug":"find-number-elements-data-stream","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/242486"}},{"articleId":242478,"title":"Elements Added to Bloom Filters","slug":"elements-added-bloom-filters","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/242478"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business 
Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281625,"slug":"algorithms-for-dummies","isbn":"9781119869986","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119869986/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119869986/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119869986-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119869986/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119869986/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/9781119869986-203x255.jpg","width":203,"height":255},"title":"Algorithms For Dummies, 2nd Edition","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"\n <p><b data-author-id=\"9109\">John Paul Mueller</b> has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. <b data-author-id=\"9110\">Luca Massaron</b> is a Google developer expert in machine learning. 
Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.</p>","authors":[{"authorId":9109,"name":"John Paul Mueller","slug":"john-paul-mueller","description":"John Paul Mueller has written more than 100 books and more than 600 articles on topics ranging from functional programming techniques to application development using C++. ","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9109"}},{"authorId":9110,"name":"Luca Massaron","slug":"luca-massaron","description":"Luca Massaron is a Google developer expert in machine learning. Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/9110"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119869986&quot;]}]\" id=\"du-slot-621d0de42ea7f\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119869986&quot;]}]\" id=\"du-slot-621d0de42f439\"></div></div>"},"articleType":{"articleType":"Cheat 
Sheet","articleList":[{"articleId":238382,"title":"Locating the Algorithm You Need","slug":"locating-algorithm-need","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238382"}},{"articleId":238386,"title":"Differentiating Algorithms from Other Math Structures","slug":"differentiating-algorithms-math-structures","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238386"}},{"articleId":238390,"title":"Amazing Ways to Use Algorithms","slug":"amazing-ways-use-algorithms","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238390"}},{"articleId":238394,"title":"Dealing with Algorithm Complexity","slug":"dealing-algorithm-complexity","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238394"}}],"content":[{"title":"Locating the algorithm you need ","thumb":null,"image":null,"content":"<p>The following table describes algorithms and algorithm types that you might find useful for various types of data analysis.</p>\n<table>\n<tbody>\n<tr>\n<td width=\"103\">Algorithm</td>\n<td width=\"282\">Description</td>\n<td width=\"147\">Helpful URL</td>\n</tr>\n<tr>\n<td width=\"103\">A* Search</td>\n<td width=\"282\">The algorithm tracks the cost of nodes as it explores them using the equation: f(n) = g(n) + h(n), where:</p>\n<p>n is the node identifier</p>\n<p>g(n) is the cost of the path from the start to the node</p>\n<p>h(n) is the estimated cost to reach the goal from the node</p>\n<p>f(n) is the estimated total cost of the path through n to the goal</p>\n<p>The idea is to search the most promising paths first and avoid expensive paths.</td>\n<td width=\"147\"><a 
href=\"http://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html\">http://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html</a></td>\n</tr>\n<tr>\n<td width=\"103\">Balanced Tree</td>\n<td width=\"282\">A kind of tree that maintains a balanced structure through reorganization so that it can provide reduced access times. The tree’s height is always O(log N), where N is the number of nodes.</td>\n<td width=\"147\"><a href=\"https://webdocs.cs.ualberta.ca/~holte/T26/balanced-trees.html\">https://webdocs.cs.ualberta.ca/~holte/T26/balanced-trees.html</a></td>\n</tr>\n<tr>\n<td width=\"103\">Bellman-Ford</td>\n<td width=\"282\">Like Dijkstra’s algorithm, this algorithm finds shortest paths, but it allows the use of negative weights. It’s also simpler than Dijkstra’s algorithm. The cost for using this algorithm is time, with a time complexity of O(VE) versus O((V+E) log V) for Dijkstra’s algorithm.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/bellman-ford-algorithm-dp-23/\">https://www.geeksforgeeks.org/bellman-ford-algorithm-dp-23/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Bidirectional Search</td>\n<td width=\"282\">This technique searches simultaneously from the root node and the goal node until the two search paths meet in the middle. An advantage of this approach is that it’s time efficient because it finds the solution faster than many other brute-force solutions. In addition, it uses memory more efficiently than other approaches and always finds a solution. The main disadvantage is complexity of implementation.</td>\n<td width=\"147\"><a href=\"http://planning.cs.uiuc.edu/node50.html\">http://planning.cs.uiuc.edu/node50.html</a></td>\n</tr>\n<tr>\n<td width=\"103\">Binary Tree</td>\n<td width=\"282\">This is a type of tree containing nodes that connect to zero (leaf nodes), one, or two (branch nodes) other nodes. 
Each node defines the three elements that it must include to provide connectivity and store data: data storage, left connection, and right connection.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/binary-tree-data-structure/\">https://www.geeksforgeeks.org/binary-tree-data-structure/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Breadth-First Search</td>\n<td width=\"282\">This technique begins at the root node, explores each of the child nodes first, and only then moves down to the next level. It progresses level by level until it finds a solution. The disadvantage of this algorithm is that it must store every node in memory, which means that it uses a considerable amount of memory for a large number of nodes. This technique can check for duplicate nodes, which saves time, and it always comes up with a solution.</td>\n<td width=\"147\"><a href=\"https://www.khanacademy.org/computing/computer-science/algorithms/breadth-first-search/a/the-breadth-first-search-algorithm\">https://www.khanacademy.org/computing/computer-science/algorithms/breadth-first-search/a/the-breadth-first-search-algorithm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Brute Force</td>\n<td width=\"282\">This is a technique of problem solving in which someone tries every possible solution, looking for the best problem solution. Brute-force techniques do guarantee a best-fit solution when one exists but are so time consuming to implement that most people avoid them for large problems.</td>\n<td width=\"147\"><a href=\"http://www-igm.univ-mlv.fr/~lecroq/string/node3.html\">http://www-igm.univ-mlv.fr/~lecroq/string/node3.html</a></td>\n</tr>\n<tr>\n<td width=\"103\">Depth-First Search</td>\n<td width=\"282\">This technique begins at the root node and explores a set of connected child nodes until it reaches a leaf node. It progresses branch by branch until it finds a solution. 
The disadvantage of this algorithm is that it can’t check for duplicate nodes, which means that it could traverse the same node paths more than once. In fact, this algorithm may not find a solution at all (for a random or heuristic depth-first search), which means that you must define a cutoff point to keep the algorithm from searching infinitely. An advantage of this approach is that it’s memory efficient.</td>\n<td width=\"147\"><a href=\"https://www.hackerearth.com/practice/algorithms/graphs/depth-first-search/tutorial/\">https://www.hackerearth.com/practice/algorithms/graphs/depth-first-search/tutorial/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Divide and Conquer</td>\n<td width=\"282\">This is a technique of problem solving in which the problem is broken into the smallest possible pieces and solved using the simplest approach possible. This technique saves considerable time and resources when compared to other approaches, such as brute force. However, it doesn’t always guarantee a best-fit result.</td>\n<td width=\"147\"><a href=\"https://www.khanacademy.org/computing/computer-science/algorithms/merge-sort/a/divide-and-conquer-algorithms\">https://www.khanacademy.org/computing/computer-science/algorithms/merge-sort/a/divide-and-conquer-algorithms</a></td>\n</tr>\n<tr>\n<td width=\"103\">Dijkstra</td>\n<td width=\"282\">This is an algorithm used for finding the shortest path in a directed, weighted (having nonnegative weights) graph.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/\">https://www.geeksforgeeks.org/dijkstras-shortest-path-algorithm-greedy-algo-7/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Floyd-Warshall</td>\n<td width=\"282\">The Floyd-Warshall algorithm is similar to Dijkstra’s algorithm in that it returns the shortest path between two points. 
However, it differs from Dijkstra and Bellman-Ford in that it returns the distance of every node with respect to all the other nodes present in the graph, and it’s relatively efficient in doing so, but it’s the slowest of the three. In some cases, Dijkstra’s and Bellman-Ford’s algorithms can produce the same result as the Floyd-Warshall algorithm, but they require longer execution times and more computations. The cost for using this algorithm is a time complexity of O(V<sup>3</sup>) versus O((V+E) log V) for Dijkstra’s algorithm.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/floyd-warshall-algorithm-dp-16/\">https://www.geeksforgeeks.org/floyd-warshall-algorithm-dp-16/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Graph</td>\n<td width=\"282\">A graph is a sort of a tree extension. As with trees, you have nodes that connect to each other to create relationships. However, unlike the nodes of a binary tree, a graph node can have more than one or two connections. In fact, graph nodes often have a multitude of connections. You see graphs used in places like maps for GPS and all sorts of other places for which the top-down approach of a tree won’t work.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/data_structures_algorithms/graph_data_structure.htm\">https://www.tutorialspoint.com/data_structures_algorithms/graph_data_structure.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Greedy Algorithms</td>\n<td width=\"282\">This is a technique of problem solving in which the solution relies on the best answer for every step of the problem-solving process. 
Greedy algorithms generally make two assumptions:</p>\n<p>Making a single optimal choice at a given step is possible.</p>\n<p>By choosing the optimal selection at each step, finding an optimal solution for the overall problem is possible.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/data_structures_algorithms/greedy_algorithms.htm\">https://www.tutorialspoint.com/data_structures_algorithms/greedy_algorithms.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Greedy Best-First Search (BFS)</td>\n<td width=\"282\">The algorithm always chooses the path that is closest to the goal using the equation: f(n) = h(n). This particular algorithm can find solutions quite quickly, but it can also get stuck in loops, so many people don’t consider it an optimal approach to finding a solution.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/best-first-search-informed-search/\">https://www.geeksforgeeks.org/best-first-search-informed-search/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Hashing</td>\n<td width=\"282\">This is a method of predicting the location of a particular data item in the data structure (whatever that structure might be) before actually looking for it. This approach relies on the use of keys placed into an index. A hash function turns the key into a numeric value that the algorithm places into a hash table. A hash table provides the means to create an index that points to elements in a data structure so that an algorithm can easily predict the location of the data.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/data_structures_algorithms/hash_data_structure.htm\">https://www.tutorialspoint.com/data_structures_algorithms/hash_data_structure.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Heap</td>\n<td width=\"282\">This is a sophisticated binary tree that allows data insertions into the tree structure. The use of data insertion makes sorting faster. 
You can further classify these trees as max heaps and min heaps, depending on the tree’s capability to immediately provide the maximum or minimum value present in the tree.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/data_structures_algorithms/heap_data_structure.htm\">https://www.tutorialspoint.com/data_structures_algorithms/heap_data_structure.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Heuristics</td>\n<td width=\"282\">This is a technique of problem solving that relies on self-discovery and produces sufficiently useful results (not necessarily optimal, but good enough) to address a problem well enough that a better solution isn’t necessary. Self-discovery is the process of allowing the algorithm to show you a potentially useful path to a solution (but you must still count on human intuition and understanding to know whether the solution is the right one).</td>\n<td width=\"147\"><a href=\"https://optimization.mccormick.northwestern.edu/index.php/Heuristic_algorithms\">https://optimization.mccormick.northwestern.edu/index.php/Heuristic_algorithms</a></td>\n</tr>\n<tr>\n<td width=\"103\">MapReduce</td>\n<td width=\"282\">This is a framework for making algorithms work using computations in parallel (using multiple computers connected together in a network), allowing algorithms to complete their solutions faster.</td>\n<td width=\"147\"><a href=\"https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html\">https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html</a></td>\n</tr>\n<tr>\n<td width=\"103\">Merge Sort</td>\n<td width=\"282\">Merge sort is a general-purpose, comparison based method of sorting data. 
It depends on a divide-and-conquer approach to performing its task.</td>\n<td width=\"147\"><a href=\"https://www.geeksforgeeks.org/merge-sort/\">https://www.geeksforgeeks.org/merge-sort/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Nash Equilibrium</td>\n<td width=\"282\">This is a game theory concept in which every player knows the equilibrium strategy of the other players, so no one has anything to gain by changing only his or her personal strategy. This concept sees use in any competitive situation in which the player must account for the decisions made by all of the other players in order to win the game.</td>\n<td width=\"147\"><a href=\"https://corporatefinanceinstitute.com/resources/knowledge/economics/nash-equilibrium-game-theory/\">https://corporatefinanceinstitute.com/resources/knowledge/economics/nash-equilibrium-game-theory/</a></td>\n</tr>\n<tr>\n<td width=\"103\">PageRank</td>\n<td width=\"282\">PageRank is an algorithm for measuring the importance of a node in a graph. This algorithm is at the root of Google’s core algorithms for delivering relevant search results to users.</td>\n<td width=\"147\"><a href=\"https://www.semrush.com/blog/pagerank/\">https://www.semrush.com/blog/pagerank/</a></td>\n</tr>\n<tr>\n<td width=\"103\">Pure Heuristic Search</td>\n<td width=\"282\">This algorithm expands nodes in order of their cost. It maintains two lists. The closed list contains the nodes that it has already explored, and the open list contains the nodes it must yet explore. In each iteration, the algorithm expands the node with the lowest possible cost. All its child nodes are placed in the closed list and the individual child node costs are calculated. The algorithm sends the child nodes with a low cost back to the open list and deletes the child nodes with a high cost. 
Consequently, the algorithm performs an intelligent, cost-based search for the solution.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_popular_search_algorithms.htm\">https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_popular_search_algorithms.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Quick Sort</td>\n<td width=\"282\">This is a general-purpose sorting strategy based on partitioning arrays of data into smaller arrays. It depends on a divide-and-conquer approach to performing its task.</td>\n<td width=\"147\"><a href=\"https://www.tutorialspoint.com/data_structures_algorithms/quick_sort_algorithm.htm\">https://www.tutorialspoint.com/data_structures_algorithms/quick_sort_algorithm.htm</a></td>\n</tr>\n<tr>\n<td width=\"103\">Unbalanced Tree</td>\n<td width=\"282\">This is a tree that places new data items wherever necessary in the tree without regard to balance. This method of adding items makes building the tree faster but reduces access speed when searching or sorting.</td>\n<td width=\"147\"><a href=\"https://www.quora.com/What-is-an-unbalanced-binary-tree-and-what-are-its-uses\">https://www.quora.com/What-is-an-unbalanced-binary-tree-and-what-are-its-uses</a></td>\n</tr>\n</tbody>\n</table>\n"},{"title":"Differentiating algorithms from other math structures ","thumb":null,"image":null,"content":"<p>If you’re like most people, you often find yourself scratching your head when it comes to math structures because no one seems to know how to use the terms correctly. It’s as though people are purposely trying to make things hard! 
After all, what is an equation and why is it different from an algorithm?</p>\n<p>Well, fear no more: The following table provides the definitive guide to math structures that you might encounter but have been afraid to ask about.</p>\n<table>\n<tbody>\n<tr>\n<td width=\"73\">Structure</td>\n<td width=\"459\">Description</td>\n</tr>\n<tr>\n<td width=\"73\">Equation</td>\n<td width=\"459\">Numbers and symbols that, when taken as a whole, equate to a specific value. An equation always contains an equals sign so that you know that the numbers and symbols represent the specific value on the other side of the equals sign. Equations generally contain variable information presented as a symbol, but they aren’t required to use variables.</td>\n</tr>\n<tr>\n<td width=\"73\">Formula</td>\n<td width=\"459\">A combination of numbers and symbols used to express information or ideas. A formula normally presents mathematical or logical concepts, such as the definition of the Greatest Common Divisor (GCD) of two integers (<a href=\"https://www.khanacademy.org/math/cc-sixth-grade-math/cc-6th-factors-and-multiples/cc-6th-gcf/v/greatest-common-divisor\" target=\"_blank\" rel=\"noopener\">this Khan Academy video</a> tells how this works). Generally, a formula shows the relationship between two or more variables. Most people see a formula as a special kind of equation.</td>\n</tr>\n<tr>\n<td width=\"73\">Algorithm</td>\n<td width=\"459\">A sequence of steps used to solve a problem. The sequence presents a unique method of addressing an issue by providing a particular solution.</p>\n<p>An algorithm need not represent mathematical or logical concepts, even though the presentations in this book often do fall into that category because people most commonly use algorithms in this manner. Some special formulas are also algorithms, such as the quadratic formula. 
For a process to represent an algorithm, it must be the following:</p>\n<p><strong>Finite:</strong> The algorithm must eventually solve the problem.</p>\n<p><strong>Well-defined:</strong> The series of steps must be precise and present steps that are understandable, especially by computers, which must be able to create a usable algorithm.</p>\n<p><strong>Effective:</strong> An algorithm must solve all cases of the problem for which someone defined it. An algorithm should always solve the problem it has to solve. Even though you should anticipate some failures, the incidence of failure is rare and occurs only in situations that are acceptable for the intended algorithm use.</td>\n</tr>\n</tbody>\n</table>\n"},{"title":"Amazing ways to use algorithms ","thumb":null,"image":null,"content":"<p>You have likely used an algorithm today without knowing it, as have most other people. For example, making toast is an example of an algorithm, as explained in <a href=\"http://blog.johnmuellerbooks.com/2013/03/04/procedures-in-technical-writing/\" target=\"_blank\" rel=\"noopener\">this blog post</a>. Making toast isn’t an amazing algorithm, but the ones in the following table, which use a computer to perform tasks, are.</p>\n<table>\n<tbody>\n<tr>\n<td width=\"127\">Task</td>\n<td width=\"405\">Why It’s Amazing</td>\n</tr>\n<tr>\n<td width=\"127\">Cryptography</td>\n<td width=\"405\">Keeping data safe is an ongoing battle with hackers constantly attacking data sources. Cryptography relies on algorithms to make data unreadable for transmission from one location to another, and then relies on other algorithms to convert the unreadable form back into a readable form. 
The most commonly used cryptographic algorithm today is Advanced Encryption Standard (AES), which you can <a href=\"https://www.tutorialspoint.com/cryptography/advanced_encryption_standard.htm\" target=\"_blank\" rel=\"noopener\">read about here</a>.</td>\n</tr>\n<tr>\n<td width=\"127\">Graph analysis</td>\n<td width=\"405\">The capability to decide on the shortest path between two points finds all sorts of uses. For example, in a routing problem, your GPS couldn’t function without this particular algorithm because it could never direct you along city streets using the shortest route from point A to point B.</td>\n</tr>\n<tr>\n<td width=\"127\">Pseudorandom number generation</td>\n<td width=\"405\">Imagine playing games that never varied. You start at the same place and perform the same steps in the same manner every time you play. Boring! Without the capability to generate seemingly random numbers, many computer tasks become pointless or impossible.</td>\n</tr>\n<tr>\n<td width=\"127\">Scheduling</td>\n<td width=\"405\">Making the use of resources fair to all concerned is another way in which algorithms make their presence known in a big way. For example, timing lights at intersections are no longer simple devices that count down the seconds between light changes. Modern devices consider all sorts of issues, such as the time of day, weather conditions, and flow of traffic.</p>\n<p>Scheduling comes in many forms, however. Consider how your computer runs multiple tasks at the same time. Without a scheduling algorithm, the operating system might grab all the available resources and keep your application from doing any useful work.</td>\n</tr>\n<tr>\n<td width=\"127\">Searching</td>\n<td width=\"405\">Locating information or verifying that the information you see is the information you want is an essential task. 
Without this capability, many tasks you perform online wouldn’t be possible, such as finding the website on the Internet that sells the perfect coffee pot for your office.</td>\n</tr>\n<tr>\n<td width=\"127\">Sorting</td>\n<td width=\"405\">Determining the order in which to present information is important because most people today suffer from information overload, and need to reduce the onrush of data. Imagine going to Amazon, finding more than a thousand coffee pots for sale, and yet not being able to sort them according to price or most positive review. Moreover, many complex algorithms require data in the proper order to work dependably, so sorting is an important requisite for solving more problems.</td>\n</tr>\n<tr>\n<td width=\"127\">Transforming</td>\n<td width=\"405\">Converting one kind of data to another kind of data is critical to understanding and using the data effectively. For example, you might understand imperial weights just fine, but all your sources use the metric system. Converting between the two systems helps you understand the data. Likewise, the Fast Fourier Transform (FFT) converts signals between the time domain and the frequency domain, enabling things like your Wi-Fi router to work.</td>\n</tr>\n</tbody>\n</table>\n<p>&nbsp;</p>\n"},{"title":"Dealing with algorithm complexity ","thumb":null,"image":null,"content":"<p>Time is money, which may not always mean what you think it means (see <a href=\"https://opher-ganel.medium.com/time-is-money-doesnt-mean-what-you-think-it-means-ba993723819c\" target=\"_blank\" rel=\"noopener\">this blog post by Opher Ganel</a>). However, time complexity in algorithms does generally break down into lost time, use of additional resources, and, yes, money.</p>\n<p>One way to compare two algorithms is through time complexity. You need to know how complex an algorithm is, because the more complex is the algorithm, the longer it takes to run. However, time complexity isn’t the only comparison measure. 
If one algorithm takes twice as long to run as another but produces a dependable result three times as often, you may need to use the slower algorithm.</p>\n<p>The following table helps you understand the various levels of time complexity presented in order of running time (from fastest to slowest).</p>\n<table>\n<tbody>\n<tr>\n<td width=\"115\">Complexity</td>\n<td width=\"417\">Description</td>\n</tr>\n<tr>\n<td width=\"115\">Constant complexity O(1)</td>\n<td width=\"417\">Provides an unvarying execution time, no matter how much input you provide. The whole operation requires a fixed amount of execution time regardless of input size.</td>\n</tr>\n<tr>\n<td width=\"115\">Logarithmic complexity O(log n)</td>\n<td width=\"417\">The number of operations grows at a slower rate than the input, so the algorithm’s advantage over linear approaches increases as the input gets larger. A typical algorithm of this class is the binary search.</td>\n</tr>\n<tr>\n<td width=\"115\">Linear complexity O(n)</td>\n<td width=\"417\">Operations grow with the input in a 1:1 ratio. A typical algorithm is iteration, when you scan input once and apply an operation to each element of it.</td>\n</tr>\n<tr>\n<td width=\"115\">Linearithmic complexity O(n log n)</td>\n<td width=\"417\">Complexity is a mix between logarithmic and linear complexity. It is typical of some smart algorithms used to order data, such as merge sort, heap sort, and (on average) quick sort.</td>\n</tr>\n<tr>\n<td width=\"115\">Quadratic complexity O(n<sup>2</sup>)</td>\n<td width=\"417\">Operations grow as a square of the number of inputs. When you have one iteration inside another iteration (called nested iterations in computer science), you have quadratic complexity. For instance, you have a list of names and, in order to find the most similar ones, you compare each name against all the other names.</p>\n<p>Some less efficient ordering algorithms present such complexity: bubble sort, selection sort, and insertion sort. 
This level of complexity means that, with large inputs, your algorithms may run for hours or even days before reaching a solution.</td>\n</tr>\n<tr>\n<td width=\"115\">Cubic complexity O(n<sup>3</sup>)</td>\n<td width=\"417\">Operations grow even faster than quadratic complexity because now you have three levels of nested iterations. When an algorithm has this order of complexity and you need to process a modest amount of data (100,000 elements), your algorithm may run for years. When you have a number of operations that is a power of the input, it is common to refer to the algorithm as running in polynomial time.</td>\n</tr>\n<tr>\n<td width=\"115\">Exponential complexity O(2<sup>n</sup>)</td>\n<td width=\"417\">The algorithm takes twice the number of previous operations for every new element added. When an algorithm has this complexity, even small problems may take practically forever. Many algorithms doing exhaustive searches have exponential complexity. However, the classic example for this level of complexity is the naive recursive calculation of Fibonacci numbers.</td>\n</tr>\n<tr>\n<td width=\"115\">Factorial complexity O(n!)</td>\n<td width=\"417\">This level of complexity presents a real nightmare because of the large number of possible combinations between the elements. 
Just imagine: If your input is 100 objects, and an operation on your computer takes 10<sup>-6</sup> seconds (a reasonable speed for every computer nowadays), you will need about 10<sup>144</sup> years to complete the task successfully (an impossible amount of time because the age of the universe is estimated as being 1.38 × 10<sup>10</sup> years).</p>\n<p>A famous problem whose brute-force solution has factorial complexity is the traveling salesman problem, in which a salesman has to find the shortest route for visiting many cities and coming back to the starting city.</td>\n</tr>\n</tbody>\n</table>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2022-02-24T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":238398},{"headers":{"creationTime":"2021-09-16T15:42:53+00:00","modifiedTime":"2022-02-25T17:43:49+00:00","timestamp":"2022-02-25T18:01:14+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"},"slug":"databases","categoryId":33579}],"title":"Data Lakes For Dummies Cheat Sheet","strippedTitle":"data lakes for dummies cheat 
sheet","slug":"data-lakes-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Learn what a data lake is, how to build a data lake, and how to ensure that your company's data lake supports a broad range of analytics.","noIndex":0,"noFollow":0},"content":"A data lake is an enterprise-scale home for analytical data from all corners of your company or governmental agency. No matter what your analytical data landscape looks like today, your organization will benefit from building a data lake.\r\n\r\n[caption id=\"attachment_285911\" align=\"alignnone\" width=\"556\"]<img class=\"size-full wp-image-285911\" src=\"https://www.dummies.com/wp-content/uploads/data-lakes-concept.jpg\" alt=\"conceptual graphic of a data lake\" width=\"556\" height=\"440\" /> © Stuart Miles / Shutterstock.com[/caption]","description":"A data lake is an enterprise-scale home for analytical data from all corners of your company or governmental agency. No matter what your analytical data landscape looks like today, your organization will benefit from building a data lake.\r\n\r\n[caption id=\"attachment_285911\" align=\"alignnone\" width=\"556\"]<img class=\"size-full wp-image-285911\" src=\"https://www.dummies.com/wp-content/uploads/data-lakes-concept.jpg\" alt=\"conceptual graphic of a data lake\" width=\"556\" height=\"440\" /> © Stuart Miles / Shutterstock.com[/caption]","blurb":"","authors":[{"authorId":10199,"name":"Alan Simpson","slug":"alan-simpson","description":"Alan Simpson is a web development professional who has published more than 100 articles and books on 
technology.","_links":{"self":"https://dummies-api.dummies.com/v2/authors/10199"}}],"primaryCategoryTaxonomy":{"categoryId":33579,"title":"Databases","slug":"databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":null,"inThisArticle":[],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":208207,"title":"Records Management For Dummies Cheat Sheet","slug":"records-management-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/208207"}},{"articleId":207503,"title":"SAS For Dummies Cheat Sheet","slug":"sas-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207503"}},{"articleId":193380,"title":"Selecting the Correct SAS Product","slug":"selecting-the-correct-sas-product","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/193380"}},{"articleId":173434,"title":"Appraising Records and Managing Retention Scheduling","slug":"appraising-records-and-managing-retention-scheduling","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173434"}},{"articleId":173432,"title":"Managing Records on Local and Network 
Drives","slug":"managing-records-on-local-and-network-drives","categoryList":["technology","information-technology","data-science","databases"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/173432"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":284366,"slug":"data-lakes-for-dummies","isbn":"9781119786160","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119786169/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119786169/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119786169-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119786169/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119786169/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/data-lakes-for-dummies-cover-9781119786160-203x255.jpg","width":203,"height":255},"title":"Data Lakes For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"\n <p>Alan Simon is the managing principal of Thinking Helmet, Inc., the author of 32 books on business technology, and a consultant who's worked with enterprise and government organizations. His professional focus is business intelligence, analytics, and data warehousing. He also teaches university courses in his specialty areas.</p>","authors":[{"authorId":10511,"name":"Alan R. Simon","slug":"alan-r-simon","description":"Alan Simon is the managing principal of Thinking Helmet, Inc., the author of 32 books on business technology, and a consultant who's worked with enterprise and government organizations. His professional focus is business intelligence, analytics, and data warehousing. He also teaches university courses in his specialty areas. 
","_links":{"self":"https://dummies-api.dummies.com/v2/authors/10511"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119786160&quot;]}]\" id=\"du-slot-6219196aaac88\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;databases&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119786160&quot;]}]\" id=\"du-slot-6219196aab788\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":0,"title":"","slug":null,"categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/"}}],"content":[{"title":"Five phases to building a data lake","thumb":null,"image":null,"content":"<p>Your data lake journey begins with a thorough understanding of today’s analytics and data throughout your entire organization. Then, you’ll methodically progress through conceptual and high-level activities into your implementation activities. 
Follow these phases whose first letters spell out A LAKE:</p>\n<ol>\n<li>ASSESS your current state, and score the results.</li>\n<li>Prepare a LOFTY VISION for what your data lake will bring you, both technology-wise and in terms of business value.</li>\n<li>Decide on your data lake ARCHITECTURE, starting at the conceptual level and then shifting into specific products and services.</li>\n<li>Begin with your KICKOFF ACTIVITIES that will deliver the first end-to-end data pipelines that culminate in high-value analytics.</li>\n<li>Progressively EXPAND your data lake through subsequent phases.</li>\n</ol>\n"},{"title":"Three types of data for a data lake","thumb":null,"image":null,"content":"<p>If you’ve been working primarily with traditional data warehouses and data marts, you’re in for a treat. Not only will your data lake include the structured data that you’re used to working with, but you’ll also ingest, manage, and deliver:</p>\n<ul>\n<li>Semi-structured data, such as tweets, blog posts, and email messages</li>\n<li>Unstructured data, such as photos, videos, and audio files</li>\n</ul>\n<p>Your next generation of analytics will be built from the fusion of these various types of data. Sometimes, the insights you need aren’t just in the numbers or the stats, but in what you can learn from these other forms of data.</p>\n"},{"title":"Four zones inside a data lake","thumb":null,"image":null,"content":"<p>From the 30,000-foot view, your data lake appears to be a large store of all types of data. When you peel the lid back, though, your data lake should be well organized into the following zones:</p>\n<ul>\n<li><strong>The bronze zone,</strong> where you ingest your raw data into inexpensive storage that is infinitely expandable . . . 
or at least pretty close to infinitely expandable!</li>\n<li><strong>The silver zone,</strong> where you store your formerly raw data that is now cleansed and enriched</li>\n<li><strong>The gold zone,</strong> where you store curated packages of data that are prepared to support users and analytical needs all across your enterprise</li>\n<li><strong>The sandbox,</strong> where you can quickly place data from elsewhere in your data lake — or even new data coming in from the outside — for experimental or short-term analysis</li>\n</ul>\n"},{"title":"Supporting an entire analytics continuum","thumb":null,"image":null,"content":"<p>Your data lake will support a broad range of analytics in a coordinated, well-architected manner. Prepare to make use of:</p>\n<ul>\n<li><strong>Descriptive analytics,</strong> which tell you what happened in the past or what’s happening right now</li>\n<li><strong>Diagnostic analytics,</strong> which dig into your descriptive analytics and help you understand why something happened or is happening</li>\n<li><strong><a href=\"https://dummies-wp-content.dummies.com/programming/big-data/data-science/big-data-visualization-tools-can-use-predictive-analytics/\" target=\"_blank\" rel=\"noopener\">Predictive analytics</a>,</strong> which tell you what’s likely to happen</li>\n<li><strong>Discovery analytics,</strong> in which you turn your analytical power loose on mountains of data with a mission to tell us interesting and important patterns and other insights out of all of this data, without our asking specific questions</li>\n<li><strong>Prescriptive analytics,</strong> which take all your other categories of analytics to the last mile and guide you to decision-making, present you with alternatives for taking action, and make a recommendation for your “best” course of 
action</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2021-09-16T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":285910}],"_links":{"self":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=0"},"next":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=10"},"last":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=355"}}},"objectTitle":"","status":"success","pageType":"article-category","objectId":"33577","page":1,"sortField":"time","sortOrder":1,"categoriesIds":[],"articleTypes":[],"filterData":{"categoriesFilter":[{"itemId":0,"itemName":"All Categories","count":365},{"itemId":33578,"itemName":"Big Data","count":173},{"itemId":33579,"itemName":"Databases","count":8},{"itemId":33580,"itemName":"General (Data Science)","count":170},{"itemId":34365,"itemName":"Web Analytics","count":14}],"articleTypeFilter":[{"articleType":"All Types","count":365},{"articleType":"Articles","count":340},{"articleType":"Cheat Sheet","count":17},{"articleType":"Step by Step","count":8}]},"filterDataLoadedStatus":"success","pageSize":10},"adsState":{"pageScripts":{"headers":{"timestamp":"2022-05-16T12:59:10+00:00"},"adsId":0,"data":{"scripts":[{"pages":["all"],"location":"header","script":"<!--Optimizely Script-->\r\n<script 
src=\"https://cdn.optimizely.com/js/10563184655.js\"></script>","enabled":false},{"pages":["all"],"location":"header","script":"<!-- comScore Tag -->\r\n<script>var _comscore = _comscore || [];_comscore.push({ c1: \"2\", c2: \"15097263\" });(function() {var s = document.createElement(\"script\"), el = document.getElementsByTagName(\"script\")[0]; s.async = true;s.src = (document.location.protocol == \"https:\" ? \"https://sb\" : \"http://b\") + \".scorecardresearch.com/beacon.js\";el.parentNode.insertBefore(s, el);})();</script><noscript><img src=\"https://sb.scorecardresearch.com/p?c1=2&c2=15097263&cv=2.0&cj=1\" /></noscript>\r\n<!-- / comScore Tag -->","enabled":true},{"pages":["all"],"location":"footer","script":"<!--BEGIN QUALTRICS WEBSITE FEEDBACK SNIPPET-->\r\n<script type='text/javascript'>\r\n(function(){var g=function(e,h,f,g){\r\nthis.get=function(a){for(var a=a+\"=\",c=document.cookie.split(\";\"),b=0,e=c.length;b<e;b++){for(var d=c[b];\" \"==d.charAt(0);)d=d.substring(1,d.length);if(0==d.indexOf(a))return d.substring(a.length,d.length)}return null};\r\nthis.set=function(a,c){var b=\"\",b=new Date;b.setTime(b.getTime()+6048E5);b=\"; expires=\"+b.toGMTString();document.cookie=a+\"=\"+c+b+\"; path=/; \"};\r\nthis.check=function(){var a=this.get(f);if(a)a=a.split(\":\");else if(100!=e)\"v\"==h&&(e=Math.random()>=e/100?0:100),a=[h,e,0],this.set(f,a.join(\":\"));else return!0;var c=a[1];if(100==c)return!0;switch(a[0]){case \"v\":return!1;case \"r\":return c=a[2]%Math.floor(100/c),a[2]++,this.set(f,a.join(\":\")),!c}return!0};\r\nthis.go=function(){if(this.check()){var a=document.createElement(\"script\");a.type=\"text/javascript\";a.src=g;document.body&&document.body.appendChild(a)}};\r\nthis.start=function(){var t=this;\"complete\"!==document.readyState?window.addEventListener?window.addEventListener(\"load\",function(){t.go()},!1):window.attachEvent&&window.attachEvent(\"onload\",function(){t.go()}):t.go()};};\r\ntry{(new 
g(100,\"r\",\"QSI_S_ZN_5o5yqpvMVjgDOuN\",\"https://zn5o5yqpvmvjgdoun-wiley.siteintercept.qualtrics.com/SIE/?Q_ZID=ZN_5o5yqpvMVjgDOuN\")).start()}catch(i){}})();\r\n</script><div id='ZN_5o5yqpvMVjgDOuN'><!--DO NOT REMOVE-CONTENTS PLACED HERE--></div>\r\n<!--END WEBSITE FEEDBACK SNIPPET-->","enabled":false},{"pages":["all"],"location":"header","script":"<!-- Hotjar Tracking Code for http://www.dummies.com -->\r\n<script>\r\n (function(h,o,t,j,a,r){\r\n h.hj=h.hj||function(){(h.hj.q=h.hj.q||[]).push(arguments)};\r\n h._hjSettings={hjid:257151,hjsv:6};\r\n a=o.getElementsByTagName('head')[0];\r\n r=o.createElement('script');r.async=1;\r\n r.src=t+h._hjSettings.hjid+j+h._hjSettings.hjsv;\r\n a.appendChild(r);\r\n })(window,document,'https://static.hotjar.com/c/hotjar-','.js?sv=');\r\n</script>","enabled":false},{"pages":["article"],"location":"header","script":"<!-- //Connect Container: dummies --> <script src=\"//get.s-onetag.com/bffe21a1-6bb8-4928-9449-7beadb468dae/tag.min.js\" async defer></script>","enabled":true},{"pages":["homepage"],"location":"header","script":"<meta name=\"facebook-domain-verification\" content=\"irk8y0irxf718trg3uwwuexg6xpva0\" />","enabled":true},{"pages":["homepage","article","category","search"],"location":"footer","script":"<!-- Facebook Pixel Code -->\r\n<noscript>\r\n<img height=\"1\" width=\"1\" src=\"https://www.facebook.com/tr?id=256338321977984&ev=PageView&noscript=1\"/>\r\n</noscript>\r\n<!-- End Facebook Pixel Code 
-->","enabled":true}]}},"pageScriptsLoadedStatus":"success"},"searchState":{"searchList":[],"searchStatus":"initial","relatedArticlesList":[],"relatedArticlesStatus":"initial"},"routeState":{"name":"ArticleCategory","path":"/category/articles/data-science-33577/","hash":"","query":{},"params":{"category":"data-science-33577"},"fullPath":"/category/articles/data-science-33577/","meta":{"routeType":"category","breadcrumbInfo":{"suffix":"Articles","baseRoute":"/category/articles"},"prerenderWithAsyncData":true},"from":{"name":null,"path":"/","hash":"","query":{},"params":{},"fullPath":"/","meta":{}}},"sfmcState":{"newsletterSignupStatus":"initial"}}

Data Science Articles

Data science is what happens when you let the world's brightest minds loose on a big dataset. It gets crazy. Our articles will walk you through what data science is and what it does.

Browse By Category

Big Data

Databases

General (Data Science)

Web Analytics

Articles From Data Science

365 results
General (Data Science): Predictive Analytics For Dummies Cheat Sheet

Cheat Sheet / Updated 04-27-2022

A predictive analytics project combines execution of details with big-picture thinking. These handy tips and checklists will help keep your project on the rails and out of the woods.

General (Data Science): Data Science Programming All-in-One For Dummies Cheat Sheet

Cheat Sheet / Updated 04-25-2022

Data science affects many different technologies in a profound manner. Our society runs on data today, so you can’t do many things that aren’t affected by it in some way. Even the timing of stoplights depends on data collected by the highway department. Your food shopping experience depends on data collected from Point of Sale (POS) terminals, surveys, farming data, and sources you can’t even begin to imagine. No matter how you use data, this cheat sheet will help you use it more effectively.

General (Data Science): Data Science Strategy For Dummies Cheat Sheet

Cheat Sheet / Updated 04-15-2022

A revolutionary change is taking place in society and it involves data science. Everybody from small local companies to global enterprises is starting to realize the potential of data science and is seeing the value in digitizing their data assets and becoming data driven. Regardless of industry, companies have embarked on a similar journey to explore how to drive new business value by utilizing analytics, machine learning (ML), and artificial intelligence (AI) techniques and introducing data science as a new discipline. However, although utilizing these new technologies will help companies simplify their operations and drive down costs, nothing is simple about getting the strategic approach right for your data science investment. This cheat sheet gives you a peek at the fundamental concepts you need to be on top of when building your data science strategy. It looks not only at investing in a top-performing data science team, but also at what to consider in your data architecture and how to approach the commercial aspects of data science.

Big Data: Big Data for Small Business For Dummies Cheat Sheet

Cheat Sheet / Updated 04-12-2022

Big data makes big headlines, but it’s much more than just a buzz phrase or the latest business fad. The phenomenon is very real and it’s producing concrete benefits in so many different areas – particularly in business. Here you will get to the heart of big data as a business owner or manager: You will take a look at the key terminology you need to understand, the crucial big data skills for businesses, ten steps to using big data to make better decisions, and tips for communicating insights from data to your colleagues.

Databases: Records Management For Dummies Cheat Sheet

Cheat Sheet / Updated 03-25-2022

Whether you’re a small business owner or work for a global corporation, you deal with information every day. You receive information, you send it, you determine what’s relevant, and you make decisions, whether consciously or subconsciously, about what information to retain. That’s why records management (managing the flood of information you get every day) should be such an important part of your business strategy. This Cheat Sheet can serve as a quick reference to some of the main aspects of records management.

Big Data: Statistics for Big Data For Dummies Cheat Sheet

Cheat Sheet / Updated 03-10-2022

Summary statistical measures represent the key properties of a sample or population as a single numerical value. This has the advantage of providing important information in a very compact form. It also simplifies comparing multiple samples or populations. Summary statistical measures can be divided into three types: measures of central tendency, measures of dispersion, and measures of association.
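
As a rough illustration of the three types named above, the following Python sketch computes one or two measures of each kind for a small made-up dataset. The function names (`pearson`, `summarize`) and the sample values are hypothetical and chosen only for this example; they do not come from the cheat sheet itself.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient, a common measure of association."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)


def summarize(xs, ys):
    """Return one summary measure of each type as a single number."""
    return {
        # Central tendency: where the values of xs cluster.
        "mean_x": statistics.fmean(xs),
        "median_x": statistics.median(xs),
        # Dispersion: how spread out xs is (sample standard deviation).
        "stdev_x": statistics.stdev(xs),
        # Association: how strongly xs and ys move together.
        "correlation_xy": pearson(xs, ys),
    }


# Hypothetical sample data, invented for illustration only.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.0, 4.0, 6.0, 8.0, 10.0]
summary = summarize(hours, scores)
print(summary)
```

Each entry condenses the whole sample into a single value, which is what makes it easy to compare several samples side by side.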

Databases: SAS For Dummies Cheat Sheet

Cheat Sheet / Updated 03-01-2022

SAS Institute has hundreds of statistical analysis system products, so a partial list of the ones you might run will help you know which one to use for your job. The tasks in SAS Enterprise Guide and SAS Add-In for Microsoft Office create SAS programs that call on SAS procedures. Having a list of those procedures and being able to find them quickly in SAS Enterprise Guide will boost your efficiency.

General (Data Science): Microsoft Power BI For Dummies Cheat Sheet

Cheat Sheet / Updated 03-01-2022

Microsoft Power BI is an enterprise-class data analytics and business intelligence platform that users connect to for data analysis, visualization, collaboration, and distribution. The platform takes a unified, scalable approach to business intelligence that enables users to gain deeper data insights while using virtually any data source available. With Power BI, you can access tools to support the entire data analysis lifecycle, from importing to transformation to visualization and collaboration. Power BI, as part of the Microsoft Power Platform, complements its sister applications: Power Apps (for no-code/low-code application development), Power Automate (for workflow development), and Power Virtual Agents (for chatbots). These applications work well with one another. Strong integration opportunities also exist with Microsoft 365 (Word, Excel, PowerPoint, and SharePoint) and Dynamics 365. When you’re looking to realize the value of your data using Microsoft applications, or even third-party applications, Power BI can provide the insights you and your organization look for at speed and scale.

General (Data Science): Algorithms For Dummies Cheat Sheet

Cheat Sheet / Updated 02-28-2022

Algorithms are fun! Algorithms are beautiful! Algorithms are even better than your favorite pastime! Well, perhaps not the last one. In fact, algorithms surround you in many ways you might not have thought about, and you use them every day to perform important tasks. However, you need to be able to use algorithms in a way that doesn’t involve becoming a mathematician. Programming languages make it possible to describe the steps used to create an algorithm, and some languages are better than others at performing this task so that people can understand it without becoming a computer or data scientist. Python makes using algorithms easier because it comes with a lot of built-in and extended support (through the use of packages, datasets, and other resources). With that in mind, this Cheat Sheet helps you access the most commonly needed tips for making your use of algorithms fast and easy.

Databases: Data Lakes For Dummies Cheat Sheet

Cheat Sheet / Updated 02-25-2022

A data lake is an enterprise-scale home for analytical data from all corners of your company or governmental agency. No matter what your analytical data landscape looks like today, your organization will benefit from building a data lake.

