{"appState":{"pageLoadApiCallsStatus":true},"categoryState":{"relatedCategories":{"headers":{"timestamp":"2025-04-17T16:01:06+00:00"},"categoryId":33577,"data":{"title":"Data Science","slug":"data-science","image":{"src":null,"width":0,"height":0},"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577}],"parentCategory":{"categoryId":33572,"title":"Information Technology","slug":"information-technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"}},"childCategories":[{"categoryId":33578,"title":"Big Data","slug":"big-data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"},"image":{"src":"/img/background-image-2.fabfbd5c.png","width":0,"height":0},"hasArticle":true,"hasBook":true,"articleCount":174,"bookCount":3},{"categoryId":33579,"title":"Databases","slug":"databases","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33579"},"image":{"src":"/img/background-image-1.daf74cf0.png","width":0,"height":0},"hasArticle":true,"hasBook":true,"articleCount":8,"bookCount":8},{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"image":{"src":"/img/background-image-2.fabfbd5c.png","width":0,"height":0},"hasArticle":true,"hasBook":true,"articleCount":173,"bookCount":14},{"categoryId":34365,"title":"Web Analytics","slug":"web-analytics","_links":{"self":"https://dummies-api.dummies.com/v2/categories/34365"},"image":{"src":"/img/background-image-1.daf74cf0.png","width":0,"height":0},"hasArticle":true,"hasBook":true,"articleCount":14,"bookCount":1}],"description":"Data science is what happens when you let the world's brightest minds loose on a big dataset. It gets crazy. Our articles will walk you through what data science is and what it does.","relatedArticles":{"self":"https://dummies-api.dummies.com/v2/articles?category=33577&offset=0&size=5"},"hasArticle":true,"hasBook":true,"articleCount":369,"bookCount":26},"_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"}},"relatedCategoriesLoadedStatus":"success"},"listState":{"list":{"count":10,"total":369,"items":[{"headers":{"creationTime":"2020-02-18T20:45:52+00:00","modifiedTime":"2024-09-24T17:47:01+00:00","timestamp":"2024-09-24T18:01:08+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Linear Regression vs. Logistic Regression","strippedTitle":"linear regression vs. 
logistic regression","slug":"linear-regression-vs-logistic-regression","canonicalUrl":"","seo":{"metaDescription":"Wondering how to differentiate between linear and logistic regression? Learn the difference here and see how it applies to data science.","noIndex":0,"noFollow":0},"content":"Both linear and logistic regression see a lot of use in <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/data-science-programming-all-in-one-for-dummies-cheat-sheet-266847/\">data science</a> but are commonly used for different kinds of problems. You need to know and understand both types of regression to perform a full range of data science tasks.\r\n\r\nOf the two, logistic regression is harder to understand in many respects because it necessarily uses a more complex equation model. The following information gives you a basic overview of how linear and logistic regression differ.\r\n<h2 id=\"tab1\" >The equation model</h2>\r\nAny discussion of the difference between linear and logistic regression must start with the underlying equation model. The equation for linear regression is straightforward.\r\n<pre class=\"code\">y = a + bx</pre>\r\nYou may see this equation in other forms and you may see it called ordinary least squares regression, but the essential concept is always the same. Depending on the source you use, some of the equations used to express logistic regression can become downright terrifying unless you’re a math major. However, the start of this discussion can use one of the simplest views of logistic regression:\r\n<pre class=\"code\">p = f(a + bx)</pre>\r\n<code>&gt;p</code>, is equal to the logistic function, <span style=\"text-decoration: line-through;\">f</span>, applied to two model parameters, <code>a</code> and <code>b</code>, and one explanatory variable, <code>x</code>. When you look at this particular model, you see that it really isn’t all that different from the linear regression model, except that you now feed the result of the linear regression through the logistic function to obtain the required curve.\r\n\r\nThe output (dependent variable) is a probability ranging from 0 (not going to happen) to 1 (definitely will happen), or a categorization that says something is either part of the category or not part of the category. (You can also perform multiclass categorization, but focus on the binary response for now.) The best way to view the difference between linear regression output and logistic regression output is to say that the following:\r\n<ul>\r\n \t<li><strong>Linear regression is continuous.</strong> A continuous value can take any value within a specified interval (range) of values. For example, no matter how closely the height of two individuals matches, you can always find someone whose height fits between those two individuals. Examples of continuous values include:\r\n<ul>\r\n \t<li>Height</li>\r\n \t<li>Weight</li>\r\n \t<li>Waist size</li>\r\n</ul>\r\n</li>\r\n \t<li><strong>Logistic regression is discrete.</strong> A discrete value has specific values that it can assume. For example, a hospital can admit only a specific number of patients in a given day. You can’t admit half a patient (at least, not alive). 
<h2 id="tab2">The logistic function</h2>
Of course, now you need to know about the logistic function. You can find a variety of forms of this function as well, but here’s the easiest one to understand:
<pre class="code">f(x) = e<sup>x</sup> / (e<sup>x</sup> + 1)</pre>
You already know about <code>f</code>, which is the logistic function, and <code>x</code> equals the value you want to transform, which is <code>a + bx</code> in this case. That leaves <code>e</code>, the base of the natural logarithm, which is an irrational number approximately equal to 2.718 (<a href="https://www.intmath.com/exponential-logarithmic-functions/5-logs-base-e-ln.php">check out a better approximation of the whole value</a>). Another way you see this function expressed is
<pre class="code">f(x) = 1 / (1 + e<sup>-x</sup>)</pre>
Both forms are correct, but the first form is easier to use. Consider a simple problem in which <code>a</code>, the y-intercept, is 0, and <code>b</code>, the slope, is 1. The example uses <code>x</code> values from –6 to 6. Consequently, the first <code>f(x)</code> value would look like this when calculated (all values are rounded):
<pre class="code">(1) e<sup>-6</sup> / (1 + e<sup>-6</sup>)
(2) 0.00248 / (1 + 0.00248)
(3) 0.002474</pre>
As you might expect, an <code>x</code> value of 0 would result in an <code>f(x)</code> value of 0.5, and an <code>x</code> value of 6 would result in an <code>f(x)</code> value of 0.9975. Obviously, a linear regression would show different results for precisely the same <code>x</code> values. If you calculate and plot all the results from both logistic and linear regression using the following code, you receive a plot like the one below.
<pre class="code">import matplotlib.pyplot as plt
%matplotlib inline
from math import exp

x_values = range(-6, 7)
lin_values = [(0 + 1*x) / 13 for x in range(0, 13)]
log_values = [exp(0 + 1*x) / (1 + exp(0 + 1*x))
              for x in x_values]

plt.plot(x_values, lin_values, 'b-^')
plt.plot(x_values, log_values, 'g-*')
plt.legend(['Linear', 'Logistic'])
plt.show()</pre>
[caption id="attachment_268339" align="aligncenter" width="556"]<img class="wp-image-268339 size-full" src="https://www.dummies.com/wp-content/uploads/data-science-programming-contrast-linear-logistic-regression.jpg" alt="Contrasting linear to logistic regression" width="556" height="368" /> Contrasting linear to logistic regression.[/caption]

This example relies on <a href="https://www.pythonforbeginners.com/basics/list-comprehensions-in-python">list comprehension</a> to calculate the values because it makes the calculations clearer. The linear regression uses a different numeric range because you must normalize the values to appear in the 0 to 1 range for comparison. This is also why you divide the calculated values by 13. The <code>exp(x)</code> call used for the logistic regression raises <code>e</code> to the power of <code>x</code>, or e<sup>x</sup>, as needed for the logistic function.
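If you want to double-check the hand calculation, a few lines of Python reproduce the rounded values for <code>x</code> values of –6, 0, and 6. This is a verification sketch, not one of the article's listings.
<pre class="code">from math import exp

def logistic(x):
    # e^x / (e^x + 1), equivalent to 1 / (1 + e^-x)
    return exp(x) / (exp(x) + 1)

for x in (-6, 0, 6):
    print(x, round(logistic(x), 4))   # about 0.0025, 0.5, and 0.9975</pre>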
<p class="article-tips warning">The model discussed here is simplified, and some math majors out there are probably throwing a temper tantrum of the most profound proportions right now. The Python or R package you use will actually take care of the math in the background, so really, what you need to know is how the math works at a basic level so that you can understand <a href="https://www.dummies.com/programming/python/view-python-package-documentation/">how to use the packages</a>. This section provides what you need to use the packages. However, if you insist on carrying out the calculations the old way, chalk to chalkboard, you’ll likely need a lot more information.</p>
<h2 id="tab3">The problems that logistic regression solves</h2>
You can separate logistic regression into several categories. The first is simple logistic regression, in which you have one dependent variable and one independent variable, much as you see in simple linear regression. However, because of how you calculate the logistic regression, you can expect only two kinds of output:
<ul>
 	<li><strong>Classification:</strong> Decides between two available outcomes, such as male or female, yes or no, or high or low. The outcome depends on which side of the line a particular data point falls.</li>
 	<li><strong>Probability:</strong> Determines the probability that something is true or false. The values true and false can have specific meanings. For example, you might want to know the probability that a particular apple will be yellow or red based on the presence of yellow and red apples in a bin.</li>
</ul>
<h2 id="tab4">Fit the curve</h2>
As part of understanding the difference between linear and logistic regression, consider this grade prediction problem, which lends itself well to linear regression. In the following code, you see the effect of trying to use logistic regression with that data:
<pre class="code">x1 = range(0, 9)
y1 = (0.25, 0.33, 0.41, 0.53, 0.59,
      0.70, 0.78, 0.86, 0.98)
plt.scatter(x1, y1, c='r', label='Org Data')

lin_values = [0.242 + 0.0933*x for x in x1]
log_values = [exp(0.242 + .9033*x) /
              (1 + exp(0.242 + .9033*x))
              for x in range(-4, 5)]

plt.plot(x1, lin_values, 'b-^', label='Linear')
plt.plot(x1, log_values, 'g-*', label='Logistic')
plt.legend()
plt.show()</pre>
The example has undergone a few changes to make it easier to see precisely what is happening. It relies on the same data that was converted from questions answered correctly on the exam to a percentage. If you have 100 questions and you answer 25 of them correctly, you have answered 25 percent (0.25) of them correctly. The values are normalized to fall between 0 and 1.

[caption id="attachment_268336" align="aligncenter" width="556"]<img class="wp-image-268336 size-full" src="https://www.dummies.com/wp-content/uploads/data-science-programming-fitting-data.jpg" alt="fitting the data for data science" width="556" height="365" /> Considering the approach to fitting the data.[/caption]

As you can see from the image above, the linear regression follows the data points closely. The logistic regression doesn’t. However, logistic regression often is the correct choice when the data points naturally follow the logistic curve, which happens far more often than you might think. You must use the technique that fits your data best, which means using linear regression in this case.
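The intercept of 0.242 and the slope used in the listing above appear to come from an ordinary least squares fit of the grade data. If you want to reproduce them, the short NumPy sketch below (not part of the original listing) returns an intercept of about 0.242 and a slope of about 0.09, close to the constants used above.
<pre class="code">import numpy as np

x1 = np.arange(0, 9)
y1 = np.array([0.25, 0.33, 0.41, 0.53, 0.59,
               0.70, 0.78, 0.86, 0.98])

# Least squares fit of a degree-1 polynomial: returns [slope, intercept].
slope, intercept = np.polyfit(x1, y1, 1)
print(round(intercept, 3), round(slope, 3))   # about 0.242 and 0.09</pre>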
<h2 id="tab5">A pass/fail example</h2>
An essential point to remember is that logistic regression works best for probability and classification. Consider that points on an exam ultimately predict passing or failing the course. If you get a certain percentage of the answers correct, you pass, but you fail otherwise. The following code considers the same data used for the example above, but converts it to a pass/fail list. When a student gets at least 70 percent of the questions correct, success is assured.
<pre class="code">y2 = [0 if x &lt; 0.70 else 1 for x in y1]
plt.scatter(x1, y2, c='r', label='Org Data')

lin_values = [0.242 + 0.0933*x for x in x1]
log_values = [exp(0.242 + .9033*x) /
              (1 + exp(0.242 + .9033*x))
              for x in range(-4, 5)]

plt.plot(x1, lin_values, 'b-^', label='Linear')
plt.plot(x1, log_values, 'g-*', label='Logistic')
plt.legend()
plt.show()</pre>
This is an example of how <a href="https://www.dummies.com/programming/big-data/data-science/using-the-python-ecosystem-for-data-science/">you can use list comprehensions in Python</a> to obtain a required dataset or data transformation. The list comprehension for <code>y2</code> starts with the continuous data in <code>y1</code> and turns it into discrete data. Note that the example uses precisely the same equations as before. All that has changed is the manner in which you view the data, as you can see below.

[caption id="attachment_268335" align="aligncenter" width="556"]<img class="wp-image-268335 size-full" src="https://www.dummies.com/wp-content/uploads/data-science-programming-linear-vs-logistic-regression.jpg" alt="linear vs logistic regression" width="556" height="363" /> Contrasting linear to logistic regression.[/caption]

Because of the change in the data, linear regression is no longer the option to choose. Instead, you use logistic regression to fit the data. Keep in mind that this example hasn’t done any sort of analysis to optimize the results; the logistic regression fits the data even better if you do.
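The paragraph above notes that the example hasn't optimized the fit and that, in practice, a Python or R package handles the math for you. As a rough sketch of what that looks like, assuming scikit-learn is available (it isn't part of the listings above), you can let <code>LogisticRegression</code> estimate the intercept and slope from the pass/fail data and read off predicted probabilities:
<pre class="code"># A sketch of letting a package optimize the fit; assumes scikit-learn
# is installed and is not part of the book's listings above.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.arange(0, 9).reshape(-1, 1)          # exam scores, as a column
y2 = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])  # fail/pass labels from above

model = LogisticRegression()
model.fit(X, y2)

print(model.intercept_, model.coef_)          # fitted a and b
print(model.predict_proba(X)[:, 1].round(3))  # probability of passing
print(model.predict(X))                       # discrete 0/1 classes</pre>
Because scikit-learn applies regularization by default, the fitted coefficients won't exactly match the hand-picked constants in the listing, but the workflow is the one the earlier warning alludes to.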
Taylor.</b> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9110"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119626114&quot;]}]\" id=\"du-slot-66f2fe651eb5d\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119626114&quot;]}]\" id=\"du-slot-66f2fe6520a09\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2024-09-24T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":268328},{"headers":{"creationTime":"2024-04-12T14:01:05+00:00","modifiedTime":"2024-04-12T14:01:05+00:00","timestamp":"2024-04-12T15:01:11+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Data Analytics & Visualization All-in-One Cheat Sheet","strippedTitle":"data analytics & visualization all-in-one cheat sheet","slug":"data-analytics-visualization-all-in-one-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"Boost your data analytics and visualization with our all-in-one cheat sheet. Learn about essential tools like Microsoft Power BI, Tableau, SQL, and more.","noIndex":0,"noFollow":0},"content":"A wide range of tools is available that are designed to help big businesses and small take advantage of the data science revolution. Among the most essential of these tools are Microsoft Power BI, Tableau, SQL, and the R and Python programming languages.","description":"A wide range of tools is available that are designed to help big businesses and small take advantage of the data science revolution. 
Taylor.</b> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9559"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781394244096&quot;]}]\" id=\"du-slot-66194cb7b2fe3\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781394244096&quot;]}]\" id=\"du-slot-66194cb7b4f9b\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":0,"title":"","slug":null,"categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/"}}],"content":[{"title":"Comparing Microsoft Power BI and Excel","thumb":null,"image":null,"content":"<p>Microsoft markets Power BI as a way to connect and visualize data using a unified, scalable platform that offers self-service and enterprise business intelligence that can help you gain deep insights into data. You may already have Excel, which will perform many of the functions you need, and wonder if upgrading to Power BI is worth the effort.</p>\n<p>Here are some advantages of upgrading to Power BI:</p>\n<ul>\n<li>Power BI supplies an array of high-level analytics offerings that Excel doesn’t include, such as the ability to create dashboards, key performance indicators (KPI), visualizations, and alerts.</li>\n<li>Power BI has significant collaboration capabilities, whereas Excel has limited data collaboration options.</li>\n<li>Though Excel can help when it comes to creating advanced reports, if you want to build data models that include predictive and machine learning assets, you have to turn to specific versions of Power BI.</li>\n<li>There is no single free version of Excel. On the other hand, you can start with Power BI for free. You can also purchase premium alternatives if you need advanced features — from a few dollars per month to several thousand.</li>\n<li>Power BI integrates business intelligence (BI) and data visualization so that users can create custom and interactive dashboards, KPIs, and reports. Microsoft Excel is limited in handling data analytics, mathematical operations, or data organization using a spreadsheet.</li>\n<li>Power BI can extract and format data from more than a single data source type. Because Power BI handles extensive data ingestion — the uploading of data from an external source, in other words —the process is, by nature, much faster.</li>\n<li>Because Power BI can connect with various data sources, the range of outputs, including dashboards and reports, is more interactive, whereas Excel is limited in scope. Above all, Power BI is a tool for data visualization and analysis that allows for collaboration. 
Excel limits sharing and data analysis to a limited number of end users.</li>\n</ul>\n"},{"title":"Engaging Tableau users based on user type","thumb":null,"image":null,"content":"<p>Tableau is a tool used for Enterprise BI but heavily leveraged in communities where data is regulated such as banking, healthcare, insurance, and government. Users access, author, prepare, interact, collaborate, and govern their data across Tableau Desktop, Tableau Prep, and Tableau Cloud based on their user type. Following is a handy “quick reference” for those times when you need to know a Tableau user’s limitations based on their user type.</p>\n<h3>Access</h3>\n<p>Tableau recognizes the following two discriminating access types:</p>\n<p><strong>Key access capabilities</strong></p>\n<table>\n<tbody>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n<tr>\n<td>Web and mobile</td>\n<td>✓</td>\n<td>✓</td>\n<td>✓</td>\n</tr>\n<tr>\n<td>Embedded content</td>\n<td>✓</td>\n<td>✓</td>\n<td>✓</td>\n</tr>\n</tbody>\n</table>\n<h3>Author</h3>\n<p>Authors in Tableau leverage the Tableau platform to make decisions by digging into the available data sources to create visualizations for themselves or manage those for others in a power-user capacity, as noted in the following table.</p>\n<p><strong>Key author capabilities offered in tableau</strong></p>\n<table>\n<thead>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Edit existing workbooks and visualizations</td>\n<td>✓</td>\n<td>✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Create and publish new workbooks from existing published data sources only</td>\n<td>✓</td>\n<td>✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Explore existing published data sources with Ask Data, a natural language engine for analytics analysis</td>\n<td>✓</td>\n<td>✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Create and publish new workbooks with one or more new data sources</td>\n<td>✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Create and publish new data sources</td>\n<td>✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Create new workbooks based on Dashboard Starters, a way to integrate with other enterprise software applications such as Salesforce CRM or SAP ERP (Tableau Cloud only)</td>\n<td>✓</td>\n<td></td>\n<td></td>\n</tr>\n</tbody>\n</table>\n<h3>Prepare</h3>\n<p>Data preparation is one area that stands out for those requiring the development functionality found in Tableau Desktop. Unless you are merely the orchestrator of data, which includes scheduling the data for dissemination, all data preparation actions fall under the Creator user type. 
The following table summarizes key capabilities for data preparation offered in Tableau.</p>\n<p><strong>Key preparation capabilities offered in Tableau</strong></p>\n<table>\n<thead>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Create new data flow files (.tfl) or .hyper files</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Edit and modify data flow files</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Export data files (.tde, .hyper, .csv)</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Publish and run flows</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Schedule flows</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n</tbody>\n</table>\n<h3>Interact</h3>\n<p>Interaction is a big part of the sales pitch with the Tableau brand, so it&#8217;s not surprising that all license types include a bevy of interaction options. The noticeable difference is the ability to download summaries as opposed to full data, as shown in the following table. If you need to dig into the weeds on a data source, the Creator license is non-negotiable.</p>\n<p><strong>Key interaction capabilities offered in Tableau</strong></p>\n<table>\n<thead>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Interact with data using a variety of visualization types</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Create and share views</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Download visualizations as static images (.pdf, .png, .jpg)</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Download summary data</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Download full data</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n</tbody>\n</table>\n<h3>Collaborate</h3>\n<p>Except for the ability to set up subscriptions and alerts for other people, a Viewer has the same collaboration features as a Creator or an Explorer. As shown in the following table, the various collaboration features enable subscriptions and alerts for others as part of the programmatic process, which a developer or power user often completes.</p>\n<p><strong>Key collaboration capabilities offered in Tableau</strong></p>\n<table>\n<tbody>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n<tr>\n<td>Comment on any visualization, including dashboards, reports, KPIs, and stories</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Create subscriptions for yourself</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Receive alert notifications</td>\n<td> ✓</td>\n<td> ✓</td>\n<td> ✓</td>\n</tr>\n<tr>\n<td>Create subscriptions for others</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Create alert notifications for others</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n</tbody>\n</table>\n<h3>Govern</h3>\n<p><em>Govern</em> is the fancy term for system administration. Viewers have no administrative capabilities, whereas an Explorer, the “power user,” can limit user access. 
But when it comes to managing enterprise security for data sources and integrating with security tenants, a way to isolate privileged and secure organizational data using an identity management platform such as Microsoft Azure Directory, you must be a Creator, as noted in the following table.</p>\n<p><strong>Key governance capabilities offered in Tableau</strong></p>\n<table>\n<thead>\n<tr>\n<td></td>\n<td>Creator</td>\n<td>Explorer</td>\n<td>Viewer</td>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Manage users and permissions</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Manage content and certify data sources</td>\n<td> ✓</td>\n<td> ✓</td>\n<td></td>\n</tr>\n<tr>\n<td>Perform server administration</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>Conduct fine-grained security management</td>\n<td> ✓</td>\n<td></td>\n<td></td>\n</tr>\n</tbody>\n</table>\n"},{"title":"SQL data types","thumb":null,"image":null,"content":"<p>SQL is a querying language that is used with proprietary and open-source data analytics and visualization platforms. The following table summarizes commonly used SQL data types and gives an example of each.</p>\n<h3>SQL data types</h3>\n<table>\n<tbody>\n<tr>\n<td><strong><em>Data Type</em></strong></td>\n<td><strong><em>Example Value</em></strong></td>\n</tr>\n<tr>\n<td><code>CHARACTER (20)</code></td>\n<td><code>'Amateur Radio'</code></td>\n</tr>\n<tr>\n<td><code>VARCHAR (20)</code></td>\n<td><code>'Amateur Radio'</code></td>\n</tr>\n<tr>\n<td><code>CLOB (1000000)</code></td>\n<td><code>'This character string is a million characters long … '</code></td>\n</tr>\n<tr>\n<td><code>SMALLINT, BIGINT, or INTEGER</code></td>\n<td><code>7500</code></td>\n</tr>\n<tr>\n<td><code>NUMERIC or DECIMAL</code></td>\n<td><code>3425.432</code></td>\n</tr>\n<tr>\n<td><code>REAL, FLOAT, or DOUBLE PRECISION</code></td>\n<td><code>6.626E-34</code></td>\n</tr>\n<tr>\n<td><code>BINARY</code></td>\n<td><code>'1011001110101010'</code></td>\n</tr>\n<tr>\n<td><code>BINARY VARYING</code></td>\n<td><code>'10110'</code></td>\n</tr>\n<tr>\n<td><code>BLOB (1000000)</code></td>\n<td><code>'1001001110101011010101010101… '</code></td>\n</tr>\n<tr>\n<td><code>BOOLEAN</code></td>\n<td><code>'true'</code></td>\n</tr>\n<tr>\n<td><code>DATE</code></td>\n<td><code>1957-08-14</code></td>\n</tr>\n<tr>\n<td><code>TIME WITHOUT TIME ZONE (2)<sup>1</sup></code></td>\n<td><code>12:46:02.43</code></td>\n</tr>\n<tr>\n<td><code>TIME WITH TIME ZONE (3)</code></td>\n<td><code>12:46:02.432-08:00</code></td>\n</tr>\n<tr>\n<td><code>TIMESTAMP WITHOUT TIME ZONE (0)</code></td>\n<td><code>1957-08-14 12:46:02</code></td>\n</tr>\n<tr>\n<td><code>TIMESTAMP WITH TIME ZONE (0)</code></td>\n<td><code>1957-08-14 12:46:02-08:00</code></td>\n</tr>\n<tr>\n<td><code>INTERVAL DAY</code></td>\n<td><code>INTERVAL '4' DAY</code></td>\n</tr>\n<tr>\n<td><code>ROW</code></td>\n<td><code>ROW (Street VARCHAR (25), City VARCHAR (20), State CHAR (2), PostalCode VARCHAR (9))</code></td>\n</tr>\n<tr>\n<td><code>ARRAY</code></td>\n<td><code>INTEGER ARRAY [15]</code></td>\n</tr>\n<tr>\n<td><code>MULTISET</code></td>\n<td><code>Phone VARCHAR (15) MULTISET [4]</code></td>\n</tr>\n<tr>\n<td><code>REF</code></td>\n<td><code>Not an ordinary type, but a pointer to a referenced type</code></td>\n</tr>\n<tr>\n<td><code>USER DEFINED TYPE</code></td>\n<td><code>Currency type based on DECIMAL</code></td>\n</tr>\n</tbody>\n</table>\n<p><em><sup>1</sup></em><em>Argument specifies number of fractional digits.</em></p>\n"},{"title":"R statistical 
functions","thumb":null,"image":null,"content":"<p>R is an open-source programming language that can be configured for use with Power BI and Tableau, but is more commonly used with open-source (free) platforms like Jupyter Notebook and Anaconda to produce data analytics outputs and visualizations. Unlike Power BI and Tableau, these open-source, programming-language-based tools tend to be used in academic settings or by analysts whose work is especially data intensive.</p>\n<p>Here’s a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages.</p>\n<h3>Central tendency and variability</h3>\n<table width=\"564\">\n<tbody>\n<tr>\n<td width=\"139\"><strong>Function</strong></td>\n<td width=\"436\"><strong>What it calculates</strong></td>\n</tr>\n<tr>\n<td width=\"139\">mean(x)</td>\n<td width=\"436\">Mean of the numbers in vector x</td>\n</tr>\n<tr>\n<td width=\"139\">median(x)</td>\n<td width=\"436\">Median of the numbers in vector x</td>\n</tr>\n<tr>\n<td width=\"139\">var(x)</td>\n<td width=\"436\">Estimated variance of the population from which the numbers in vector x are sampled</td>\n</tr>\n<tr>\n<td width=\"139\">sd(x)</td>\n<td width=\"436\">Estimated standard deviation of the population from which the numbers in vector x are sampled</td>\n</tr>\n<tr>\n<td width=\"139\">scale(x)</td>\n<td width=\"436\">Standard scores (<em>z-</em>scores) for the numbers in vector x</td>\n</tr>\n</tbody>\n</table>\n<h3>Relative standing</h3>\n<table width=\"564\">\n<tbody>\n<tr>\n<td width=\"271\"><strong>Function</strong></td>\n<td width=\"293\"><strong>What it calculates</strong></td>\n</tr>\n<tr>\n<td width=\"271\">sort(x)</td>\n<td width=\"293\">The numbers in vector x in increasing order</td>\n</tr>\n<tr>\n<td width=\"271\">sort(x)[n]</td>\n<td width=\"293\">The <em>n</em>th smallest number in vector x</td>\n</tr>\n<tr>\n<td width=\"271\">rank(x)</td>\n<td width=\"293\">Ranks of the numbers (in increasing order) in vector x</td>\n</tr>\n<tr>\n<td width=\"271\">rank(-x)</td>\n<td width=\"293\">Ranks of the numbers (in decreasing order) in vector x</td>\n</tr>\n<tr>\n<td width=\"271\">rank(x, ties.method = \"average\")</td>\n<td width=\"293\">Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained</td>\n</tr>\n<tr>\n<td width=\"271\">rank(x, ties.method = \"min\")</td>\n<td width=\"293\">Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained</td>\n</tr>\n<tr>\n<td width=\"271\">rank(x, ties.method = \"max\")</td>\n<td width=\"293\">Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained</td>\n</tr>\n<tr>\n<td width=\"271\">quantile(x)</td>\n<td width=\"293\">The 0th, 25th, 50th, 75th, and 100th percentiles (the <em>quartiles, </em>in other words) of the numbers in vector x. (That’s not a misprint: quantile(x) returns the quartiles of x.)</td>\n</tr>\n</tbody>\n</table>\n<h3><em>t-</em>tests</h3>\n<table width=\"564\">\n<tbody>\n<tr>\n<td width=\"190\"><strong>Function</strong></td>\n<td width=\"386\"><strong>What it calculates</strong></td>\n</tr>\n<tr>\n<td width=\"190\">t.test(x,mu=n, alternative = \"two.sided\")</td>\n<td width=\"386\">Two-tailed <em>t-</em>test that the mean of the numbers in vector <em>x </em>is different from <em>n</em>.</td>\n</tr>\n<tr>\n<td width=\"190\">t.test(x,mu=n, alternative = \"greater\")</td>\n<td width=\"386\">One-tailed <em>t-</em>test that the mean of the numbers in vector <em>x</em> is greater than <em>n</em>.</td>\n</tr>\n<tr>\n<td width=\"190\">t.test(x,mu=n, alternative = \"less\")</td>\n<td width=\"386\">One-tailed <em>t-</em>test that the mean of the numbers in vector <em>x</em> is less than <em>n</em>.</td>\n</tr>\n<tr>\n<td width=\"190\">t.test(x,y,mu=0, var.equal = TRUE, alternative = \"two.sided\")</td>\n<td width=\"386\">Two-tailed <em>t-</em>test that the mean of the numbers in vector <em>x</em> is different from the mean of the numbers in vector <em>y</em>. The variances in the two vectors are assumed to be equal.</td>\n</tr>\n<tr>\n<td width=\"190\">t.test(x,y,mu=0, alternative = \"two.sided\", paired = TRUE)</td>\n<td width=\"386\">Two-tailed <em>t-</em>test that the mean of the numbers in vector <em>x</em> is different from the mean of the numbers in vector <em>y</em>. The vectors represent matched samples.</td>\n</tr>\n</tbody>\n</table>\n<h3>Analysis of variance (ANOVA)</h3>\n<table width=\"564\">\n<tbody>\n<tr>\n<td width=\"104\"><strong>Function</strong></td>\n<td width=\"468\"><strong>What it calculates</strong></td>\n</tr>\n<tr>\n<td width=\"104\">aov(y~x, data = d)</td>\n<td width=\"468\">Single-factor ANOVA, with the numbers in vector <em>y</em> as the dependent variable and the elements of vector <em>x </em>as the levels of the independent variable. The data are in data frame <em>d</em>.</td>\n</tr>\n<tr>\n<td width=\"104\">aov(y~x + Error(w/x), data = d)</td>\n<td width=\"468\">Repeated Measures ANOVA, with the numbers in vector <em>y </em>as the dependent variable and the elements in vector <em>x </em>as the levels of an independent variable. Error(w/x) indicates that each element in vector <em>w</em> experiences all the levels of <em>x</em>. (In other words, <em>x</em> is a repeated measure.) The data are in data frame <em>d</em>.</td>\n</tr>\n<tr>\n<td width=\"104\">aov(y~x*z, data = d)</td>\n<td width=\"468\">Two-factor ANOVA, with the numbers in vector<em> y</em> as the dependent variable and the elements of vectors <em>x </em>and <em>z</em> as the levels of the two independent variables. The data are in data frame <em>d</em>.</td>\n</tr>\n<tr>\n<td width=\"104\">aov(y~x*z + Error(w/z), data = d)</td>\n<td width=\"468\">Mixed ANOVA, with the numbers in vector <em>y</em> as the dependent variable and the elements of vectors <em>x</em> and <em>z</em> as the levels of the two independent variables. Error(w/z) indicates that each element in vector <em>w</em> experiences all the levels of <em>z</em>. (In other words, <em>z</em> is a repeated measure.) 
The data are in data frame <em>d</em>.</td>\n</tr>\n</tbody>\n</table>\n<h3>Correlation and regression</h3>\n<table width=\"564\">\n<tbody>\n<tr>\n<td width=\"106\"><strong>Function</strong></td>\n<td width=\"466\"><strong>What it calculates</strong></td>\n</tr>\n<tr>\n<td width=\"106\">cor(x,y)</td>\n<td width=\"466\">Correlation coefficient between the numbers in vector <em>x</em> and the numbers in vector <em>y</em></td>\n</tr>\n<tr>\n<td width=\"106\">cor.test(x,y)</td>\n<td width=\"466\">Correlation coefficient between the numbers in vector <em>x </em>and the numbers in vector <em>y</em>, along with a <em>t-</em>test of the significance of the correlation coefficient.</td>\n</tr>\n<tr>\n<td width=\"106\">lm(y~x, data = d)</td>\n<td width=\"466\">Linear regression analysis with the numbers in vector <em>y</em> as the dependent variable and the numbers in vector <em>x </em>as the independent variable. Data are in data frame <em>d</em>.</td>\n</tr>\n<tr>\n<td width=\"106\">coefficients(a)</td>\n<td width=\"466\">Slope and intercept of linear regression model <em>a.</em></td>\n</tr>\n<tr>\n<td width=\"106\">confint(a)</td>\n<td width=\"466\">Confidence intervals of the slope and intercept of linear regression model <em>a</em>.</td>\n</tr>\n<tr>\n<td width=\"106\">lm(y~x+z, data = d)</td>\n<td width=\"466\">Multiple regression analysis with the numbers in vector <em>y </em>as the dependent variable and the numbers in vectors<em> x</em> and<em> z</em> as the independent variables. Data are in data frame <em>d</em>.</td>\n</tr>\n</tbody>\n</table>\n<p class=\"article-tips tip\">When you carry out an ANOVA or a regression analysis, store the analysis in a list — for example: a &lt;- lm(y~x, data = d). Then, to see the tabled results, use the summary() function: summary(a)</p>\n"},{"title":"Python line plot styles","thumb":null,"image":null,"content":"<p>Like R, Python is an open-source programming language that can be configured for use with Power BI and Tableau, but is more commonly used with open-source (free) platforms such as Jupyter Notebook and Anaconda.</p>\n<p>When you use Python to create a plot, you need to identify the sources of information using more than just the lines. Creating a plot that uses differing line types and data point symbols makes the plot much easier for other people to use. Following is a table that lists the line plot styles.</p>\n<table width=\"564\">\n<tbody>\n<tr>\n<td colspan=\"2\"><strong>Color</strong></td>\n<td colspan=\"2\"><strong>Marker</strong></td>\n<td colspan=\"2\"><strong>Style</strong></td>\n</tr>\n<tr>\n<td><strong>Code</strong></td>\n<td><strong>Line Color</strong></td>\n<td><strong>Code</strong></td>\n<td><strong>Marker Style</strong></td>\n<td><strong>Code</strong></td>\n<td><strong>Line Style</strong></td>\n</tr>\n<tr>\n<td>b</td>\n<td>blue</td>\n<td>.</td>\n<td>point</td>\n<td>-</td>\n<td>Solid</td>\n</tr>\n<tr>\n<td>g</td>\n<td>green</td>\n<td>o</td>\n<td>circle</td>\n<td>:</td>\n<td>Dotted</td>\n</tr>\n<tr>\n<td>r</td>\n<td>red</td>\n<td>x</td>\n<td>x-mark</td>\n<td>-.</td>\n<td>Dash-dot</td>\n</tr>\n<tr>\n<td>c</td>\n<td>cyan</td>\n<td>+</td>\n<td>plus</td>\n<td>--</td>\n<td>Dashed</td>\n</tr>\n<tr>\n<td>m</td>\n<td>magenta</td>\n<td>*</td>\n<td>star</td>\n<td>(none)</td>\n<td>No line</td>\n</tr>\n<tr>\n<td>y</td>\n<td>yellow</td>\n<td>s</td>\n<td>square</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>k</td>\n<td>black</td>\n<td>d</td>\n<td>diamond</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td>w</td>\n<td>white</td>\n<td>v</td>\n<td>down triangle</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td></td>\n<td></td>\n<td>^</td>\n<td>up triangle</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td></td>\n<td></td>\n<td>&lt;</td>\n<td>left triangle</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td></td>\n<td></td>\n<td>&gt;</td>\n<td>right triangle</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td></td>\n<td></td>\n<td>p</td>\n<td>5-point star</td>\n<td></td>\n<td></td>\n</tr>\n<tr>\n<td></td>\n<td></td>\n<td>h</td>\n<td>6-point star</td>\n<td></td>\n<td></td>\n</tr>\n</tbody>\n</table>\n<p class=\"article-tips tip\">Remember that you can also use these styles with other kinds of plots. For example, a scatter plot can use these styles to define each of the data points. 
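</p>\n<p>To see how the codes combine, here is a minimal sketch that assumes the standard matplotlib plotting library (which accepts these single-character codes in its format strings) and uses made-up data. Each format string stacks a color code, a marker code, and a line-style code:</p>\n<pre class=\"code\">import matplotlib.pyplot as plt\n\nx = [1, 2, 3, 4, 5]                      # made-up sample data\ny = [2, 3, 5, 4, 6]\n\nplt.plot(x, y, 'ro-')                    # red circles on a solid line\nplt.plot(x, [v + 1 for v in y], 'g^:')   # green up triangles on a dotted line\nplt.plot(x, [v - 1 for v in y], 'kd')    # black diamonds only, no connecting line\nplt.show()</pre>\n<p class=\"article-tips tip\">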
When in doubt, try the styles to see whether they’ll work with your particular plot.</p>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2024-04-12T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":301769},{"headers":{"creationTime":"2023-12-01T15:39:09+00:00","modifiedTime":"2023-12-01T15:39:09+00:00","timestamp":"2023-12-01T18:01:09+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"Big Data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"},"slug":"big-data","categoryId":33578}],"title":"Beyond Boundaries: Unstructured Data Orchestration","strippedTitle":"beyond boundaries: unstructured data orchestration","slug":"beyond-boundaries-unstructured-data-orchestration","canonicalUrl":"","seo":{"metaDescription":"Find out what data orchestration is and how an unstructured data architecture can work for you.","noIndex":0,"noFollow":0},"content":"Getting the most out of your unstructured data is an essential task for any organization these days, especially when considering the disparate storage systems, applications, and user locations. So, it’s not an accident that data orchestration is the term that brings everything together.\r\n\r\nBringing all your data together shares similarities with conducting an orchestra. Instead of combining the violin, oboe, and cello, this brand of orchestration combines distributed data types from different places, platforms, and locations working as a cohesive entity presented to applications or users anywhere. That’s because historically, accessing high-performance data outside of your computer network was inefficient. Because the storage infrastructure existed in a silo, systems like HPC Parallel (which lets users store and access shared data across multiple networked storage nodes), Enterprise NAS (which allows large-scale storage and access to other networks), and Global Namespace (virtually simplifies network file systems) were limited when it came to sharing. Because each operated independently, the data within each system was siloed making it a problem collaborating with data sets over multiple locations.\r\n\r\nCollaboration was possible, but too often you lost the ability to have high performance. 
This Boolean logic decreased potential because having an IT architecture that supported both high performance and collaboration with data sets from different storage silos typically became an either/or decision: You were forced to choose one but never both.\r\n<h2 id=\"tab1\" >What is data orchestration?</h2>\r\nData orchestration is the automated process of taking siloed data from multiple data storage systems and locations, combining and organizing it into a single namespace. Then a high-performance file system can place data in the edge service, data center, or cloud service most optimal for the workload.\r\n\r\nThe recent rise of data analytic applications and artificial intelligence (AI) capabilities has accelerated the use of data across different locations and even different organizations. In the next data cycle, organizations will need both high-performance and agility with their data to compete and thrive in a competitive environment.\r\n\r\nThat means data no longer has a 1:1 relationship with the applications and compute environment that generated it. It needs to be used, analyzed, and repurposed with different AI models and alternate workloads, and across a remote, collaborative environment.\r\n<p class=\"remember\">Hammerspace’s technology makes data available to different foundational models, remote applications, decentralized compute clusters, and remote workers to automate and streamline data-driven development programs, data insights, and business decision making. This capability enables a unified, fast, and efficient global data environment for the entire workflow — from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds.</p>\r\n\r\nControl of enterprise data services for governance, security, data protection, and compliance can now be implemented globally at a file-granular level across all storage types and locations. Applications and AI models can access data stored in remote locations while using automated orchestration tools to provide high-performance local access when needed for processing. Organizations can grow their talent pools with access to team members no matter where they reside.\r\n<h2 id=\"tab2\" >Decentralizing the data center</h2>\r\nData collection has become more prominent, and the traditional system of centralized data management has limitations. Issues of centralized data storage can limit the amount of data available to applications. Then, there are the high infrastructure costs when multiple applications are needed to manage and move data, multiple copies of data are retained in different storage systems, and more headcount is needed to manage the complex, disconnected infrastructure environment. Such setbacks suggest that the data center is no longer the center of data and storage system constraints should no longer define data architectures.\r\n\r\nHammerspace specializes in decentralized environments, where data may need to span two or more sites and possibly one or more cloud providers and regions, and/or where a remote workforce needs to collaborate in real time. 
It enables a global data environment by providing a unified, parallel global file system.\r\n<h2 id=\"tab3\" >Enabling a global data environment</h2>\r\nHammerspace completely revolutionizes previously held notions of how unstructured data architectures should be designed, delivering the performance needed across distributed environments to\r\n<ul>\r\n \t<li>Free workloads from data silos.</li>\r\n \t<li>Eliminate copy proliferation.</li>\r\n \t<li>Provide direct data access through local metadata to applications and users, no matter where the data is stored.</li>\r\n</ul>\r\nThis technology allows organizations to take full advantage of the performance capabilities of any server, storage system, and network anywhere in the world. This capability enables a unified, fast, and efficient global data environment for the entire workflow, from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds.\r\n\r\nThe days of enterprises struggling with a siloed, distributed, and inefficient data environment are over. It’s time to start expecting more from data architectures with automated data orchestration. Find out how by downloading <em>Unstructured Data Orchestration For Dummies,</em> Hammerspace Special Edition, <a class=\"bookSponsor-btn\" href=\"https://hammerspace.com/for-dummies/\" target=\"_blank\" rel=\"noopener\" data-testid=\"bookSponsorDownloadButton\">here</a>.","description":"Getting the most out of your unstructured data is an essential task for any organization these days, especially when considering the disparate storage systems, applications, and user locations. So, it’s not an accident that data orchestration is the term that brings everything together.\r\n\r\nBringing all your data together shares similarities with conducting an orchestra. Instead of combining the violin, oboe, and cello, this brand of orchestration combines distributed data types from different places, platforms, and locations working as a cohesive entity presented to applications or users anywhere. That’s because historically, accessing high-performance data outside of your computer network was inefficient. Because the storage infrastructure existed in a silo, systems like HPC Parallel (which lets users store and access shared data across multiple networked storage nodes), Enterprise NAS (which allows large-scale storage and access to other networks), and Global Namespace (virtually simplifies network file systems) were limited when it came to sharing. Because each operated independently, the data within each system was siloed making it a problem collaborating with data sets over multiple locations.\r\n\r\nCollaboration was possible, but too often you lost the ability to have high performance. This Boolean logic decreased potential because having an IT architecture that supported both high performance and collaboration with data sets from different storage silos typically became an either/or decision: You were forced to choose one but never both.\r\n<h2 id=\"tab1\" >What is data orchestration?</h2>\r\nData orchestration is the automated process of taking siloed data from multiple data storage systems and locations, combining and organizing it into a single namespace. 
Then a high-performance file system can place data in the edge service, data center, or cloud service most optimal for the workload.\r\n\r\nThe recent rise of data analytic applications and artificial intelligence (AI) capabilities has accelerated the use of data across different locations and even different organizations. In the next data cycle, organizations will need both high-performance and agility with their data to compete and thrive in a competitive environment.\r\n\r\nThat means data no longer has a 1:1 relationship with the applications and compute environment that generated it. It needs to be used, analyzed, and repurposed with different AI models and alternate workloads, and across a remote, collaborative environment.\r\n<p class=\"remember\">Hammerspace’s technology makes data available to different foundational models, remote applications, decentralized compute clusters, and remote workers to automate and streamline data-driven development programs, data insights, and business decision making. This capability enables a unified, fast, and efficient global data environment for the entire workflow — from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds.</p>\r\n\r\nControl of enterprise data services for governance, security, data protection, and compliance can now be implemented globally at a file-granular level across all storage types and locations. Applications and AI models can access data stored in remote locations while using automated orchestration tools to provide high-performance local access when needed for processing. Organizations can grow their talent pools with access to team members no matter where they reside.\r\n<h2 id=\"tab2\" >Decentralizing the data center</h2>\r\nData collection has become more prominent, and the traditional system of centralized data management has limitations. Issues of centralized data storage can limit the amount of data available to applications. Then, there are the high infrastructure costs when multiple applications are needed to manage and move data, multiple copies of data are retained in different storage systems, and more headcount is needed to manage the complex, disconnected infrastructure environment. Such setbacks suggest that the data center is no longer the center of data and storage system constraints should no longer define data architectures.\r\n\r\nHammerspace specializes in decentralized environments, where data may need to span two or more sites and possibly one or more cloud providers and regions, and/or where a remote workforce needs to collaborate in real time. It enables a global data environment by providing a unified, parallel global file system.\r\n<h2 id=\"tab3\" >Enabling a global data environment</h2>\r\nHammerspace completely revolutionizes previously held notions of how unstructured data architectures should be designed, delivering the performance needed across distributed environments to\r\n<ul>\r\n \t<li>Free workloads from data silos.</li>\r\n \t<li>Eliminate copy proliferation.</li>\r\n \t<li>Provide direct data access through local metadata to applications and users, no matter where the data is stored.</li>\r\n</ul>\r\nThis technology allows organizations to take full advantage of the performance capabilities of any server, storage system, and network anywhere in the world. 
This capability enables a unified, fast, and efficient global data environment for the entire workflow, from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds.\r\n\r\nThe days of enterprises struggling with a siloed, distributed, and inefficient data environment are over. It’s time to start expecting more from data architectures with automated data orchestration. Find out how by downloading <em>Unstructured Data Orchestration For Dummies,</em> Hammerspace Special Edition, <a class=\"bookSponsor-btn\" href=\"https://hammerspace.com/for-dummies/\" target=\"_blank\" rel=\"noopener\" data-testid=\"bookSponsorDownloadButton\">here</a>.","blurb":"","authors":[{"authorId":9204,"name":"John Carucci","slug":"john-carucci","description":" <p><b>John Carucci </b>is not a celebrity, though he certainly brushes up against the stars of stage and screen on a regular basis in his role as an Entertainment TV Producer with the Associated Press. Along with hobnobbing with actors and musicians, John is also author of <i>Digital SLR Video & Filmmaking For Dummies</i> and two editions of <i>GoPro Cameras For Dummies</i>.</p> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9204"}}],"primaryCategoryTaxonomy":{"categoryId":33578,"title":"Big Data","slug":"big-data","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33578"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"What is data orchestration?","target":"#tab1"},{"label":"Decentralizing the data center","target":"#tab2"},{"label":"Enabling a global data environment","target":"#tab3"}],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":207996,"title":"Big Data For Dummies Cheat Sheet","slug":"big-data-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207996"}},{"articleId":207478,"title":"Statistics for Big Data For Dummies Cheat Sheet","slug":"statistics-for-big-data-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207478"}},{"articleId":207432,"title":"Big Data 
for Small Business For Dummies Cheat Sheet","slug":"big-data-for-small-business-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/207432"}},{"articleId":168988,"title":"Integrate Big Data with the Traditional Data Warehouse","slug":"integrate-big-data-with-the-traditional-data-warehouse","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168988"}},{"articleId":168986,"title":"Big Data Planning Stages","slug":"big-data-planning-stages","categoryList":["technology","information-technology","data-science","big-data"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/168986"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":0,"slug":null,"isbn":null,"categoryList":null,"amazon":null,"image":null,"title":null,"testBankPinActivationLink":null,"bookOutOfPrint":false,"authorsInfo":null,"authors":null,"_links":null},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]},{&quot;key&quot;:&quot;sponsored&quot;,&quot;values&quot;:[&quot;customsolutions&quot;]}]\" id=\"du-slot-656a1f65440bd\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;big-data&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]},{&quot;key&quot;:&quot;sponsored&quot;,&quot;values&quot;:[&quot;customsolutions&quot;]}]\" id=\"du-slot-656a1f6544d68\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":true,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"Brought to you by Hammerspace","brandingLink":"https://hammerspace.com/","brandingLogo":{"src":"https://www.dummies.com/wp-content/uploads/hammerspace-logo-266x55-1.png","width":266,"height":55},"sponsorAd":"","sponsorEbookTitle":"Unstructured Data Orchestration For Dummies, Hammerspace Special Edition","sponsorEbookLink":"https://hammerspace.com/for-dummies/","sponsorEbookImage":{"src":"https://www.dummies.com/wp-content/uploads/unstructured-data-orchestration-hammerspace-special-edition-cover-9781394211364-165x255.jpg","width":165,"height":255}},"primaryLearningPath":"Solve","lifeExpectancy":"One 
year","lifeExpectancySetFrom":"2023-12-05T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[{"adPairKey":"sponsored","adPairValue":"customsolutions"}]},"status":"publish","visibility":"public","articleId":301250},{"headers":{"creationTime":"2017-04-18T00:55:51+00:00","modifiedTime":"2023-07-27T12:42:43+00:00","timestamp":"2023-07-27T15:01:03+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"E-Commerce and Data Testing Tactics","strippedTitle":"e-commerce and data testing tactics","slug":"e-commerce-data-testing-tactics","canonicalUrl":"","seo":{"metaDescription":"In growth, you use testing methods to optimize your web design and messaging so that it performs at its absolute best with the audiences to which it's targeted.","noIndex":0,"noFollow":0},"content":"In growth, you use testing methods to optimize your web design and messaging so that it performs at its absolute best with the audiences to which it's targeted. Although testing and web analytics methods are both intended to optimize performance, testing goes one layer deeper than web analytics. You use web analytics to get a general idea about the interests of your channel audiences and how well your marketing efforts are paying off over time.\r\n\r\nAfter you have this information, you can then go in deeper to test variations on live visitors in order to gain empirical evidence about what designs and messaging your visitors actually prefer.\r\n\r\nTesting tactics can help you optimize your website design or brand messaging for increased conversions in all layers of the funnel. Testing is also useful when optimizing your landing pages for user activations and revenue conversions.\r\n<h2 id=\"tab1\" >Checking out common types of testing in growth</h2>\r\nWhen you use data insights to increase growth for e-commerce businesses, you're likely to run into the three following testing tactics: A/B split testing, multivariate testing, and mouse-click heat map analytics.\r\n\r\nAn <em>A/B split test</em> is an optimization tactic you can use to split variations of your website or brand messaging between sets of live audiences in order to gauge responses and decide which of the two variations performs best. A/B split testing is the simplest testing method you can use for website or messaging optimization.\r\n\r\n<em>Multivariate testing</em> is, in many ways, similar to the multivariate regression analysis that I discuss in Chapter 5. Like multivariate regression analysis, multivariate testing allows you to uncover relationships, correlations, and causations between variables and outcomes. In the case of multivariate testing, you're testing several conversion factors simultaneously over an extended period in order to uncover which factors are responsible for increased conversions. 
Multivariate testing is more complicated than A/B split testing, but it usually provides quicker and more powerful results.\r\n\r\nLastly, you can use <em>mouse-click heat map analytics</em> to see how visitors are responding to your design and messaging choices. In this type of testing, you use the mouse-click heat map to help you make optimal website design and messaging choices to ensure that you're doing everything you can to keep your visitors focused and converting.\r\n<p class=\"article-tips remember\">Landing pages are meant to offer visitors little to no options, except to convert or to exit the page. Because a visitor has so few options on what he can do on a landing page, you don't really need to use multivariate testing or website mouse-click heat maps. Simple A/B split tests suffice.</p>\r\nData scientists working in growth hacking should be familiar with (and know how to derive insight from) the following testing applications:\r\n<ul>\r\n\t<li><strong><a href=\"http://webtrends.com/\" target=\"_blank\" rel=\"noopener\">Webtrends</a>:</strong> Offers a conversion-optimization feature that includes functionality for A/B split testing and multivariate testing.</li>\r\n\t<li><strong><a href=\"http://www.optimizely.com/\" target=\"_blank\" rel=\"noopener\">Optimizely</a>:</strong> A popular product among the growth-hacking community. You can use Optimizely for multipage funnel testing, A/B split testing, and multivariate testing, among other things.</li>\r\n\t<li><strong><a href=\"https://vwo.com/\" target=\"_blank\" rel=\"noopener\">Visual Website Optimizer</a>:</strong> An excellent tool for A/B split testing and multivariate testing.</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Testing for acquisitions</h2>\r\nAcquisitions testing provides feedback on how well your content performs with prospective users in your assorted channels. You can use acquisitions testing to help compare your message's performance in each channel, helping you optimize your messaging on a per-channel basis. If you want to optimize the performance of your brand's published images, you can use acquisition testing to compare image performance across your channels as well. Lastly, if you want to increase your acquisitions through increases in user referrals, use testing to help optimize your referrals messaging for the referrals channels. Acquisition testing can help you begin to understand the specific preferences of prospective users on a channel-by-channel basis. You can use A/B split testing to improve your acquisitions in the following ways:\r\n<ul>\r\n\t<li><strong>Social messaging optimization:</strong> After you use social analytics to deduce the general interests and preferences of users in each of your social channels, you can then further optimize your brand messaging along those channels by using A/B split testing to compare your headlines and social media messaging within each channel.</li>\r\n\t<li><strong>Brand image and messaging optimization:</strong> Compare and optimize the respective performances of images along each of your social channels.</li>\r\n\t<li><strong>Optimized referral messaging:</strong> Test the effectiveness of your email messaging at converting new user referrals.</li>\r\n</ul>\r\n<h2 id=\"tab3\" >Testing for activations</h2>\r\nActivation testing provides feedback on how well your website and its content perform in converting acquired users to active users. The results of activation testing can help you optimize your website and landing pages for maximum sign-ups and subscriptions. 
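\r\n\r\nSuppose, for example, that variant A of a sign-up page converts 58 of 1,000 visitors and variant B converts 81 of 1,000. Before declaring B the winner of the A/B split test, you want some assurance that the gap is not just noise. The following sketch (made-up numbers, Python standard library only, using a two-proportion z-test as one reasonable way to check significance) shows the idea:\r\n<pre class=\"code\">from math import sqrt\r\nfrom statistics import NormalDist\r\n\r\n# Made-up A/B split test results: conversions and visitors per variant\r\nconv_a, n_a = 58, 1000   # variant A of the sign-up page\r\nconv_b, n_b = 81, 1000   # variant B of the sign-up page\r\n\r\np_a, p_b = conv_a / n_a, conv_b / n_b\r\np_pool = (conv_a + conv_b) / (n_a + n_b)            # pooled conversion rate\r\nse = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))\r\n\r\nz = (p_b - p_a) / se                                # two-proportion z statistic\r\np_value = 2 * (1 - NormalDist().cdf(abs(z)))        # two-tailed p-value\r\n\r\nprint(f'A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}  p = {p_value:.4f}')</pre>\r\nA small p-value suggests the difference in conversion rates is unlikely to be chance alone; how small is small enough is a threshold you set before running the test.\r\n\r\n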
Here's how you'd use testing methods to optimize user activation growth:\r\n<ul>\r\n\t<li><strong>Website conversion optimization:</strong> Make sure your website is optimized for user activation conversions. You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help you optimize your website design.</li>\r\n\t<li><strong>Landing pages:</strong> If your landing page has a simple call to action that prompts guests to subscribe to your email list, you can use A/B split testing for simple design optimization of this page and the call-to-action messaging.</li>\r\n</ul>\r\n<h2 id=\"tab4\" >Testing for retentions</h2>\r\nRetentions testing provides feedback on how well your blog post and email headlines are performing among your base of activated users. If you want to optimize your headlines so that active users want to continue active engagements with your brand, test the performance of your user-retention tactics. Here's how you can use testing methods to optimize user retention growth:\r\n<ul>\r\n\t<li><strong>Headline optimization:</strong> Use A/B split testing to optimize the headlines of your blog posts and email marketing messages. Test different headline varieties within your different channels, and then use the varieties that perform the best. Email open rates and RSS view rates are ideal metrics to track the performance of each headline variation.</li>\r\n\t<li><strong>Conversion rate optimization:</strong> Use A/B split testing on the messaging within your emails to decide which messaging variety more effectively gets your activated users to engage with your brand. The more effective your email messaging is at getting activated users to take a desired action, the greater your user retention rates.</li>\r\n</ul>\r\n<h2 id=\"tab5\" >Testing for revenue growth</h2>\r\nRevenue testing gauges the performance of revenue-generating landing pages, e-commerce pages, and brand messaging. Revenue testing methods can help you optimize your landing and e-commerce pages for sales conversions. Here's how you can use testing methods to optimize revenue growth:\r\n<ul>\r\n\t<li><strong>Website conversion optimization:</strong> You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help optimize your sales page and shopping cart design for revenue-generating conversions.</li>\r\n\t<li><strong>Landing page optimization:</strong> If you have a landing page with a simple call to action that prompts guests to make a purchase, you can use A/B split testing for design optimization.</li>\r\n</ul>","description":"In growth, you use testing methods to optimize your web design and messaging so that it performs at its absolute best with the audiences to which it's targeted. Although testing and web analytics methods are both intended to optimize performance, testing goes one layer deeper than web analytics. You use web analytics to get a general idea about the interests of your channel audiences and how well your marketing efforts are paying off over time.\r\n\r\nAfter you have this information, you can then go in deeper to test variations on live visitors in order to gain empirical evidence about what designs and messaging your visitors actually prefer.\r\n\r\nTesting tactics can help you optimize your website design or brand messaging for increased conversions in all layers of the funnel. 
Testing is also useful when optimizing your landing pages for user activations and revenue conversions.\r\n<h2 id=\"tab1\" >Checking out common types of testing in growth</h2>\r\nWhen you use data insights to increase growth for e-commerce businesses, you're likely to run into the three following testing tactics: A/B split testing, multivariate testing, and mouse-click heat map analytics.\r\n\r\nAn <em>A/B split test</em> is an optimization tactic you can use to split variations of your website or brand messaging between sets of live audiences in order to gauge responses and decide which of the two variations performs best. A/B split testing is the simplest testing method you can use for website or messaging optimization.\r\n\r\n<em>Multivariate testing</em> is, in many ways, similar to the multivariate regression analysis that I discuss in Chapter 5. Like multivariate regression analysis, multivariate testing allows you to uncover relationships, correlations, and causations between variables and outcomes. In the case of multivariate testing, you're testing several conversion factors simultaneously over an extended period in order to uncover which factors are responsible for increased conversions. Multivariate testing is more complicated than A/B split testing, but it usually provides quicker and more powerful results.\r\n\r\nLastly, you can use <em>mouse-click heat map analytics</em> to see how visitors are responding to your design and messaging choices. In this type of testing, you use the mouse-click heat map to help you make optimal website design and messaging choices to ensure that you're doing everything you can to keep your visitors focused and converting.\r\n<p class=\"article-tips remember\">Landing pages are meant to offer visitors little to no options, except to convert or to exit the page. Because a visitor has so few options on what he can do on a landing page, you don't really need to use multivariate testing or website mouse-click heat maps. Simple A/B split tests suffice.</p>\r\nData scientists working in growth hacking should be familiar with (and know how to derive insight from) the following testing applications:\r\n<ul>\r\n\t<li><strong><a href=\"http://webtrends.com/\" target=\"_blank\" rel=\"noopener\">Webtrends</a>:</strong> Offers a conversion-optimization feature that includes functionality for A/B split testing and multivariate testing.</li>\r\n\t<li><strong><a href=\"http://www.optimizely.com/\" target=\"_blank\" rel=\"noopener\">Optimizely</a>:</strong> A popular product among the growth-hacking community. You can use Optimizely for multipage funnel testing, A/B split testing, and multivariate testing, among other things.</li>\r\n\t<li><strong><a href=\"https://vwo.com/\" target=\"_blank\" rel=\"noopener\">Visual Website Optimizer</a>:</strong> An excellent tool for A/B split testing and multivariate testing.</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Testing for acquisitions</h2>\r\nAcquisitions testing provides feedback on how well your content performs with prospective users in your assorted channels. You can use acquisitions testing to help compare your message's performance in each channel, helping you optimize your messaging on a per-channel basis. If you want to optimize the performance of your brand's published images, you can use acquisition testing to compare image performance across your channels as well. 
Lastly, if you want to increase your acquisitions through increases in user referrals, use testing to help optimize your referrals messaging for the referrals channels. Acquisition testing can help you begin to understand the specific preferences of prospective users on a channel-by-channel basis. You can use A/B split testing to improve your acquisitions in the following ways:\r\n<ul>\r\n\t<li><strong>Social messaging optimization:</strong> After you use social analytics to deduce the general interests and preferences of users in each of your social channels, you can then further optimize your brand messaging along those channels by using A/B split testing to compare your headlines and social media messaging within each channel.</li>\r\n\t<li><strong>Brand image and messaging optimization:</strong> Compare and optimize the respective performances of images along each of your social channels.</li>\r\n\t<li><strong>Optimized referral messaging:</strong> Test the effectiveness of your email messaging at converting new user referrals.</li>\r\n</ul>\r\n<h2 id=\"tab3\" >Testing for activations</h2>\r\nActivation testing provides feedback on how well your website and its content perform in converting acquired users to active users. The results of activation testing can help you optimize your website and landing pages for maximum sign-ups and subscriptions. Here's how you'd use testing methods to optimize user activation growth:\r\n<ul>\r\n\t<li><strong>Website conversion optimization:</strong> Make sure your website is optimized for user activation conversions. You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help you optimize your website design.</li>\r\n\t<li><strong>Landing pages:</strong> If your landing page has a simple call to action that prompts guests to subscribe to your email list, you can use A/B split testing for simple design optimization of this page and the call-to-action messaging.</li>\r\n</ul>\r\n<h2 id=\"tab4\" >Testing for retentions</h2>\r\nRetentions testing provides feedback on how well your blog post and email headlines are performing among your base of activated users. If you want to optimize your headlines so that active users want to continue active engagements with your brand, test the performance of your user-retention tactics. Here's how you can use testing methods to optimize user retention growth:\r\n<ul>\r\n\t<li><strong>Headline optimization:</strong> Use A/B split testing to optimize the headlines of your blog posts and email marketing messages. Test different headline varieties within your different channels, and then use the varieties that perform the best. Email open rates and RSS view rates are ideal metrics to track the performance of each headline variation.</li>\r\n\t<li><strong>Conversion rate optimization:</strong> Use A/B split testing on the messaging within your emails to decide which messaging variety more effectively gets your activated users to engage with your brand. The more effective your email messaging is at getting activated users to take a desired action, the greater your user retention rates.</li>\r\n</ul>\r\n<h2 id=\"tab5\" >Testing for revenue growth</h2>\r\nRevenue testing gauges the performance of revenue-generating landing pages, e-commerce pages, and brand messaging. Revenue testing methods can help you optimize your landing and e-commerce pages for sales conversions. 
Here's how you can use testing methods to optimize revenue growth:\r\n<ul>\r\n\t<li><strong>Website conversion optimization:</strong> You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help optimize your sales page and shopping cart design for revenue-generating conversions.</li>\r\n\t<li><strong>Landing page optimization:</strong> If you have a landing page with a simple call to action that prompts guests to make a purchase, you can use A/B split testing for design optimization.</li>\r\n</ul>","blurb":"","authors":[{"authorId":9232,"name":"Lillian Pierson","slug":"lillian-pierson","description":" <p><b>Lillian Pierson</b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9232"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"Checking out common types of testing in growth","target":"#tab1"},{"label":"Testing for acquisitions","target":"#tab2"},{"label":"Testing for activations","target":"#tab3"},{"label":"Testing for retentions","target":"#tab4"},{"label":"Testing for revenue growth","target":"#tab5"}],"relatedArticles":{"fromBook":[{"articleId":238086,"title":"Data Journalism: Collecting Data for Your Story","slug":"data-journalism-collecting-data-story","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238086"}},{"articleId":238083,"title":"Data Journalism: How to Develop, Tell, and Present the Story","slug":"data-journalism-develop-tell-present-story","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238083"}},{"articleId":238080,"title":"Data Journalism: Why the Story 
Matters","slug":"data-journalism-story-matters","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238080"}},{"articleId":238077,"title":"The Where in Data Journalism","slug":"the-where-in-data-journalism","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238077"}},{"articleId":238074,"title":"The When in Data Journalism","slug":"the-when-in-data-journalism","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238074"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281677,"slug":"data-science-for-dummies","isbn":"9781119811558","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119811554-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/data-science-for-dummies-3rd-edition-cover-9781119811558-203x255.jpg","width":203,"height":255},"title":"Data Science For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"<p><b><b data-author-id=\"9232\">Lillian Pierson</b></b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. 
Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p>","authors":[{"authorId":9232,"name":"Lillian Pierson","slug":"lillian-pierson","description":" <p><b>Lillian Pierson</b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9232"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119811558&quot;]}]\" id=\"du-slot-64c286af22456\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119811558&quot;]}]\" id=\"du-slot-64c286af22c54\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Two years","lifeExpectancySetFrom":"2023-07-27T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":238044},{"headers":{"creationTime":"2020-09-10T21:44:16+00:00","modifiedTime":"2023-07-24T14:51:59+00:00","timestamp":"2023-07-24T15:01:03+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Blockchain Data Analytics For Dummies Cheat Sheet","strippedTitle":"blockchain data analytics for dummies cheat sheet","slug":"blockchain-data-analytics-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"To best leverage blockchain data analytics, become familiar with blockchain technology, how it stores data, and how to extract and 
analyze data.","noIndex":0,"noFollow":0},"content":"Blockchain technology is much more than just another way to store data. It's a radical new method of storing validated data and transaction information in an indelible, trusted repository. Blockchain has the potential to disrupt business as we know it, and in the process, provide a rich new source of behavioral data. Data analysts have long found valuable insights from historical data, and blockchain can expose new and reliable data to drive business strategy. To best leverage the value that blockchain data offers, become familiar with <a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/what-is-a-blockchain-and-how-does-it-work-262432/\">blockchain technology</a> and how it stores data, and learn how to extract and analyze this data.\r\n\r\n[caption id=\"attachment_273254\" align=\"alignnone\" width=\"556\"]<img class=\"size-full wp-image-273254\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics.jpg\" alt=\"blockchain data analytics\" width=\"556\" height=\"371\" /> © everything possible / Shutterstock.com[/caption]","description":"","blurb":"","authors":[{"authorId":27199,"name":"Michael G. Solomon","slug":"michael-solomon","description":"Michael G. Solomon, PhD, is a professor at the University of the Cumberlands who specializes in courses on blockchain and distributed computing systems as well as computer security. He holds numerous security and project management certifications and has written several books on security and project management, including Ethereum For Dummies. 
","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/27199"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business 
Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281640,"slug":"blockchain-data-analytics-for-dummies","isbn":"9781119651772","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119651778/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119651778/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119651778-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119651778/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119651778/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-for-dummies-cover-9781119651772-203x255.jpg","width":203,"height":255},"title":"Blockchain Data Analytics For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"<p><b>Kiana Danial</b> is an investment trainer and consultant as well as the author of <i>Cryptocurrency Investing For Dummies.</i></p> <p><b>Peter Kent</b> is a veteran technology author. <b>Tyler Bain </b>is a Certified Bitcoin Professional. Peter and Tyler are co-authors of <i>Cryptocurrency Mining For Dummies</i>. <b>Tiana Laurence </b>heads her own venture capital firm and is author of <i>Blockchain For Dummies</i>, 2nd Edition. <b><b data-author-id=\"34832\">Michael G. Solomon</b>, PhD,</b> is a professor of Computer Information Sciences as well as author of <i>Ethereum For Dummies</i>.</p>","authors":[{"authorId":34832,"name":"Michael G. Solomon","slug":"michael-g-solomon","description":" <p><b>Kiana Danial</b> is an investment trainer and consultant as well as the author of <i>Cryptocurrency Investing For Dummies.</i></p> <p><b>Peter Kent</b> is a veteran technology author. <b>Tyler Bain </b>is a Certified Bitcoin Professional. Peter and Tyler are co-authors of <i>Cryptocurrency Mining For Dummies</i>. <b>Tiana Laurence </b>heads her own venture capital firm and is author of <i>Blockchain For Dummies</i>, 2nd Edition. <b>Michael G. 
Solomon, PhD,</b> is a professor of Computer Information Sciences as well as author of <i>Ethereum For Dummies</i>.</p>","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/34832"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119651772&quot;]}]\" id=\"du-slot-64be922f68ca6\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119651772&quot;]}]\" id=\"du-slot-64be922f6949f\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":0,"title":"","slug":null,"categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/"}}],"content":[{"title":"Blockchain quick facts","thumb":null,"image":null,"content":"<p>Blockchain technology is a fast-growing disruptive technology that enables verified data to be shared among a set of untrusted parties. Moreover, blockchain makes it possible to share ledgers of items of value and control the exchange of these items in an untrusted environment.</p>\n<p>Today’s blockchains come in public, private, and hybrid versions that support complex software applications. 
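The hash-linking idea that anchors the feature list below is easier to grasp with a small example. Here's a minimal, purely illustrative Python sketch (not any real blockchain's implementation) of a chain in which each block stores the hash of its predecessor, so tampering with one block invalidates every block that follows it:
<pre class=\"code\">
# Hedged sketch: a toy hash-linked chain, not a production blockchain.
import hashlib
import json

def block_hash(block):
    # Hash the block contents in a stable (sorted-key) JSON form
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def add_block(chain, data):
    previous_hash = block_hash(chain[-1]) if chain else '0' * 64
    chain.append({'data': data, 'previous_hash': previous_hash})

def chain_is_valid(chain):
    # Every block must store the hash of the block before it
    for i in range(1, len(chain)):
        if chain[i]['previous_hash'] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
add_block(chain, 'Alice pays Bob 5 tokens')
add_block(chain, 'Bob pays Carol 2 tokens')
add_block(chain, 'Carol pays Dave 1 token')
print(chain_is_valid(chain))        # True

chain[1]['data'] = 'Bob pays Carol 200 tokens'   # tamper with history
print(chain_is_valid(chain))        # False: the tampered link no longer matches
</pre>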
Learning about blockchain technology will help you understand how people and organizations will conduct business in the near future and beyond.</p>\n<p>The following list summarizes some of blockchain technology’s most important features:</p>\n<ul>\n<li>A <em>blockchain</em> is a chain of blocks, where each block stores the mathematical hash value of the previous block, connecting it to its predecessor.</li>\n<li>Any change to a block invalidates that block and all subsequent blocks in the chain.</li>\n<li>The most common use of a blockchain is to transfer some item of value (crypto-asset) from one account to another.</li>\n<li>Crypto-asset owners are identified by an <em>address,</em> which is related to the account’s public encryption key.</li>\n<li>Crypto-assets exist only in a blockchain, and each one is associated with some value in the real world.</li>\n<li>A transaction transfers a crypto-asset from one account (owner) to another.</li>\n<li>All transactions are digitally signed with the crypto-asset owner’s private key.</li>\n<li>Cryptography makes it easy to verify a transaction’s owner (digital signature) and a block’s integrity (hash).</li>\n<li><em>Smart contracts</em> are programs that define rules that control how data gets added to, and read from, the blockchain.</li>\n<li>Smart contracts must run the same way, and produce the same results, on every network node instance.</li>\n<li>Like data, smart contract code is stored in a blockchain block and can never be changed.</li>\n<li>Databases support create, read, update, and delete (CRUD) operations, whereas blockchains support only read and write operations.</li>\n<li>Blockchain’s add-only property keeps blocks in chronological order and makes it easy to trace a crypto-asset throughout its lifecycle (forward and backward).</li>\n<li>The most popular consensus mechanism used today is Proof-of-Work (PoW), which requires enormous amounts of energy. However, other consensus algorithms, such as Proof-of-Stake (PoS), are becoming more popular; they can help blockchains handle transactions faster and make the technology a better fit for more applications.</li>\n</ul>\n"},{"title":"Data analytics quick facts","thumb":null,"image":null,"content":"<p>Data analytics is all about finding hidden nuggets of valuable information in data. If the information you’re looking for were easy to find, you wouldn’t need analytics.</p>\n<p>The real power of data analytics is in its capability to learn from the past in ways that can help you improve the chances for success in the future. Success might be measured as increased sales, reduced costs, or having the right products in the right place at the right time.</p>\n<p>Understanding the different analytics models and their use is key to unlocking your data’s secrets. 
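As a small taste of the models covered in the list that follows, here's a hedged scikit-learn sketch of k-means clustering on invented account-activity data; the feature values, the cluster count, and the availability of <code>numpy</code> and <code>scikit-learn</code> are all assumptions made for illustration.
<pre class=\"code\">
# Hedged sketch: k-means clustering on made-up account-activity features.
import numpy as np
from sklearn.cluster import KMeans

# Each row is (transactions per week, average transfer size)
accounts = np.array([
    [2, 0.10], [3, 0.20], [2, 0.15],      # low-activity accounts
    [40, 1.5], [38, 1.7], [42, 1.4],      # busy, retail-style accounts
    [5, 50.0], [4, 60.0], [6, 55.0],      # infrequent but large transfers
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(accounts)

for row, label in zip(accounts, labels):
    print(f'account {row} -> cluster {label}')
print('cluster centers:', kmeans.cluster_centers_)
</pre>
Each cluster groups accounts that behave similarly, which is the kind of relationship-finding (identification) the clustering bullet in the list below describes.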
The following list summarizes some of the most common analytics techniques and models you’ll use when analyzing blockchain data:</p>\n<ul>\n<li>Every analytics model should exist to satisfy a specific business goal.</li>\n<li>Before starting to select any analytics model, create a data inventory (of on-chain and off-chain data).</li>\n<li>Build an analytics lab that allows you to experiment in an isolated environment.</li>\n<li>Determine your primary goals: identification, explanation, prediction, or any combination thereof.</li>\n<li>Clustering models show relationships among objects (identification).</li>\n<li>Association models can reveal objects that frequently exist together in transactions (explanation).</li>\n<li>Classification models determine to which group a new object belongs (identification).</li>\n<li>Prediction models predict future outcomes based on historical data (prediction).</li>\n<li>Object characteristics, or attributes, are often called <em>features</em>.</li>\n<li>A model’s output quality depends on selecting the best features to analyze.</li>\n<li><em>Scatterplots</em> can help determine which feature sets affect outcomes.</li>\n<li><em>K-means</em> is a popular clustering algorithm that reveals relationships between objects by identifying object clusters.</li>\n<li><em>Apriori</em> is a useful algorithm for showing objects that occur together in transactions frequently (market basket analysis).</li>\n<li><em>Decision tree</em> and <em>naïve Bayes</em> are both classification algorithms that help determine ways to label objects based on a limited number of labels.</li>\n<li>Regression algorithms (<em>linear regression</em> for continuous data and <em>logistic regression</em> for categorical data) can predict future behavior based on historical data.</li>\n<li><em>Time-series analysis</em> algorithms can remove cyclic and seasonal variations to reveal trends in data.</li>\n<li>Creating simple and effective visualizations of model results is important to communicate your results.</li>\n<li>Every analytics model should be validated with metrics to assess its accuracy and output significance.</li>\n</ul>\n"},{"title":"Extracting data from a blockchain quick facts","thumb":null,"image":null,"content":"<p>Data analytics models rely on data. You’ll need data to choose the right model, build it, train it, and then run it using new data. The process of preparing input data for an analytics model can be tedious.</p>\n<p>In a blockchain environment, building an analytics dataset includes identifying the data you’ll need, fetching it from the blockchain, and then completing the data picture with related data from off-chain repositories. Although you follow the same basic steps each time you populate a model, adding blockchain to the process adds another layer of requirements. 
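Before you get to the list that follows, it can help to see what the last mile of dataset building looks like. The sketch below assumes the transaction records have already been pulled from a node with a client library such as web3.py; the records themselves, the field names, and the labeling rule are all invented for illustration, and <code>pandas</code> plus <code>scikit-learn</code> are assumed to be installed.
<pre class=\"code\">
# Hedged sketch: shaping already-extracted transaction records into a dataset.
# The records are invented; in practice you would fetch them from a blockchain
# node (for example, with web3.py) before this step.
import pandas as pd
from sklearn.model_selection import train_test_split

records = [
    {'block': 101, 'sender': '0xa1', 'receiver': '0xb2', 'value_wei': 5 * 10**17},
    {'block': 102, 'sender': '0xb2', 'receiver': '0xc3', 'value_wei': 2 * 10**18},
    {'block': 103, 'sender': '0xa1', 'receiver': '0xc3', 'value_wei': 7 * 10**16},
    {'block': 104, 'sender': '0xc3', 'receiver': '0xa1', 'value_wei': 1 * 10**18},
]

df = pd.DataFrame(records)
df['value_eth'] = df['value_wei'] / 10**18             # simple cleansing/conversion
df['is_large'] = (df['value_eth'] >= 1.0).astype(int)  # illustrative label

# Partition the dataset into training and testing sets
train_df, test_df = train_test_split(df, test_size=0.25, random_state=42)
print(train_df[['block', 'value_eth', 'is_large']])
print(test_df[['block', 'value_eth', 'is_large']])
</pre>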
The following list summarizes common concepts for accessing blockchain data and techniques for building a dataset from a blockchain:</p>\n<ul>\n<li>Use a blockchain explorer to examine all blockchains of interest for data.</li>\n<li>Smart contracts store state data and generate transaction data.</li>\n<li>Smart contracts generate events that result in log file entries (valuable for a timestamped record of actions).</li>\n<li>Build an analytics lab to support connecting to all blockchains of interest using your favorite language, such as Python or JavaScript.</li>\n<li>Decide between extracting data from the blockchain first (faster for multiple runs) or letting the analytics model collect blockchain data in real time (access to the latest data).</li>\n<li>When a model needs off-chain data as well as on-chain data, prebuilding datasets is generally easier.</li>\n<li>Establish a strategy for aligning on-chain and off-chain identities.</li>\n<li>Access blockchain data through event filters, smart contract functions, or state data queries.</li>\n<li>Cleanse blockchain data and convert any formats or types as necessary.</li>\n<li>Identify related off-chain data (or on-chain data from another blockchain) and fetch related data.</li>\n<li>Either store full data in a format suited for reading into a dataframe or develop conversion code.</li>\n<li>Identify partitions in your dataset to use for training and testing your model. Quasi-random object selection for each partition increases your model’s accuracy.</li>\n<li>Devise a strategy for updating extracted data with fresh data.</li>\n<li>Plan to re-train models as your dataset changes.</li>\n</ul>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2022-02-25T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":273253},{"headers":{"creationTime":"2020-12-29T21:13:08+00:00","modifiedTime":"2023-07-24T14:42:04+00:00","timestamp":"2023-07-24T15:01:03+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"The Primary Types of Blockchain","strippedTitle":"the primary types of blockchain","slug":"the-primary-types-of-blockchain","canonicalUrl":"","seo":{"metaDescription":"Have you ever wondered what the different types of blockchain look like? 
Use this guide from Dummies.com to learn how they compare.","noIndex":0,"noFollow":0},"content":"In 2008, Bitcoin was the only <a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/brief-history-bitcoin-blockchain-242874/\">blockchain implementation</a>. At that time, Bitcoin and blockchain were synonymous. Now hundreds of different blockchain implementations exist. Each new blockchain implementation emerges to address a particular need and each one is unique. However, blockchains tend to share many features with other blockchains. Before examining blockchain applications and data, it helps to look at their similarities.\r\n\r\n[caption id=\"attachment_275189\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275189 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockcahin-data-analytics-types.png\" alt=\"blockchain\" width=\"556\" height=\"324\" /> ©Shutterstock/phive[/caption]\r\n<p class=\"article-tips tip\">Check out this article to <a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/what-is-blockchain-and-what-blockchains-are-242782/\">learn how blockchains work</a>.</p>\r\n\r\n<h2 id=\"tab1\" >Categorizing blockchain implementations</h2>\r\nOne of the most common ways to evaluate blockchains is to consider the underlying <em>data visibility,</em> that is, who can see and access the blockchain data. And just as important, who can participate in the decision (consensus) to add new blocks to the blockchain? The three primary blockchain models are public, private, and hybrid.\r\n<h3>Opening blockchain to everyone</h3>\r\nNakamoto’s original blockchain proposal described a public blockchain. After all, blockchain technology is all about providing trusted transactions among untrusted participants. Sharing a ledger of transactions among nodes in a public network provides a classic untrusted network. If anyone can join the network, you have no criteria on which to base your trust. It’s almost like throwing a $20 bill out your window and trusting that only the person you intend will pick it up.\r\n\r\n<em>Public blockchain</em> implementations, including Bitcoin and Ethereum, depend on a consensus algorithm that makes it hard to mine blocks but easy to validate them. PoW is the most common consensus algorithm in use today for public blockchains, but that may change. Ethereum is in the process of transitioning to the <em>Proof of Stake (PoS) </em>consensus algorithm, which requires less computation and depends on how much blockchain currency a node holds. The idea is that a node with more blockchain currency would be affected negatively if it participates in unethical behavior. The higher the stake you have in something, the greater the chance that you’ll care about its integrity.\r\n\r\nBecause public blockchains are open to anyone (anyone can become a node on the network), no permission is needed to join. For this reason, a public blockchain is also called a <em>permissionless blockchain.</em> Public (permissionless) blockchains are most often used for new apps that interact with the public in general. A public blockchain is like a retail store, in that anyone can walk into the store and shop.\r\n<h3>Limiting blockchain access</h3>\r\nThe opposite of a public blockchain is a private blockchain, such as Hyperledger Fabric. 
In a <em>private blockchain,</em> also called a <em>permissioned blockchain,</em> the entity that owns and controls the blockchain grants and revokes access to the blockchain data. Because most enterprises manage sensitive or private data, private blockchains are commonly used because they can limit access to that data.\r\n\r\nThe blockchain data is still transparent and readily available but is subject to the owning entity’s access requirements. Some have argued that private blockchains violate data transparency, the original intent of blockchain technology. Although private blockchains can limit data access (and go against the philosophy of the original blockchain in Bitcoin), limited transparency also allows enterprises to consider blockchain technology for new apps in a private environment. Without the private blockchain option, the technology likely would never be considered for most enterprise applications.\r\n<h3>Combining the best of both worlds</h3>\r\nA classic blockchain use case is a supply chain app, which manages a product from its production all the way through its consumption. The beginning of the supply chain is when a product is manufactured, harvested, caught, or otherwise provisioned to send to an eventual customer. The supply chain app then tracks and manages each transfer of ownership as the product makes its way to the physical location where the consumer purchases it.\r\n\r\nSupply chain apps manage product movement, process payment at each stage in the movement lifecycle, and create an audit trail that can be used to investigate the actions of each owner along the supply chain. Blockchain technology is well suited to support the transfer of ownership and maintain an indelible record of each step in the process.\r\n\r\nMany supply chains are complex and consist of multiple organizations. In such cases, data suffers as it is exported from one participant, transmitted to the next participant, and then imported into their data system. A single blockchain would simplify the export/transport/import cycle and auditing. An additional benefit of blockchain technology in supply chain apps is the ease with which a product’s <em>provenance</em> (a trace of owners back to its origin) is readily available.\r\n\r\nMany of today’s supply chains are made up of several enterprises that enter into agreements to work together for mutual benefit. Although the participants in a supply chain are business partners, they do not fully trust one another. A blockchain can provide the level of transactional and data trust that the enterprises need.\r\n\r\nThe best solution is a semi-private blockchain – that is, the blockchain is public for supply chain participants but not to anyone else. This type of blockchain (one that is owned by a group of entities) is called a <em>hybrid, </em>or <em>consortium, blockchain.</em> The participants jointly own the blockchain and agree on policies to govern access.\r\n<h2 id=\"tab2\" >Describing basic blockchain type features</h2>\r\nEach type of blockchain has specific strengths and weaknesses. Which one to use depends on the goals and target environment. You have to know why you need blockchain and what you expect to get from it before you can make an informed decision as to what type of blockchain would be best. The best solution for one organization may not be the best solution for another. 
The table below shows how blockchain types compare and why you might choose one over the other.\r\n<h3>Differences in Types of Blockchain</h3>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<th>Feature</th>\r\n<th>Public</th>\r\n<th>Private</th>\r\n<th>Hybrid</th>\r\n</tr>\r\n<tr>\r\n<td>Permission</td>\r\n<td>Permissionless</td>\r\n<td>Permissioned (limited to organization members)</td>\r\n<td>Permissioned (limited to consortium members)</td>\r\n</tr>\r\n<tr>\r\n<td>Consensus</td>\r\n<td>PoW, PoS, and so on</td>\r\n<td>Authorized participants</td>\r\n<td>Varies; can use any method</td>\r\n</tr>\r\n<tr>\r\n<td>Performance</td>\r\n<td>Slow (due to consensus)</td>\r\n<td>Fast (relatively)</td>\r\n<td>Generally fast</td>\r\n</tr>\r\n<tr>\r\n<td>Identity</td>\r\n<td>Virtually anonymous</td>\r\n<td>Validated identity</td>\r\n<td>Validated identity</td>\r\n</tr>\r\n</tbody>\r\n</table>\r\nThe primary differences between each type of blockchain are the consensus algorithm used and whether participants are known or anonymous. These two concepts are related. An unknown (and therefore completely untrusted) participant will require an environment with a more rigorous consensus algorithm. On the other hand, if you know the transaction participants, you can use a less rigorous consensus algorithm.\r\n<h2 id=\"tab3\" >Contrasting popular enterprise blockchain implementations</h2>\r\nDozens of blockchain implementations are available today, and soon there will be hundreds. Each new blockchain implementation targets a specific market and offers unique features. There isn’t room in this article to cover even a fair number of blockchain implementations, but you should be aware of some of the most popular.\r\n\r\nRemember that you’ll be learning about blockchain analytics in this book. Although organizations of all sizes are starting to leverage the power of analytics, enterprises were early adopters and have the most mature approach to extracting value from data.\r\n<p class=\"article-tips tech\">The What Matrix website provides a comprehensive comparison of top enterprise blockchains. Visit <a href=\"http://www.whatmatrix.com/comparison/Blockchain-for-Enterprise\" target=\"_blank\" rel=\"noopener\">whatmatrix.com</a> for up-to-date blockchain information.</p>\r\nFollowing are the top enterprise blockchain implementations and some of their strengths and weaknesses (ranking is based on the What Matrix website):\r\n<ul>\r\n \t<li><strong>Hyperledger Fabric:</strong> The flagship blockchain implementation from the Linux Foundation. Hyperledger is an open-source project backed by a diverse consortium of large corporations. Hyperledger’s modular architecture and rich support make it the highest rated enterprise blockchain.</li>\r\n \t<li><strong>VeChain:</strong> Currently more popular than Hyperledger, having the highest number of enterprise use cases among products reviewed by What Matrix. VeChain includes support for two native cryptocurrencies and states that its focus is on efficient enterprise collaboration.</li>\r\n \t<li><strong>Ripple Transaction Protocol:</strong> A blockchain that focuses on financial markets. Instead of appealing to general use cases, Ripple caters to organizations that want to implement financial transaction blockchain apps. Ripple was the first commercially available blockchain focused on financial solutions.</li>\r\n \t<li><strong>Ethereum:</strong> The most popular general-purpose, public blockchain implementation. 
Although <a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/what-is-ethereum-263632/\">Ethereum</a> is not technically an enterprise solution, it's in use in multiple proof of concept projects.</li>\r\n</ul>\r\nThe preceding list is just a brief overview of a small sample of blockchain implementations. If you’re just beginning to learn about blockchain technology in general, start out with Ethereum, which is one of the easier blockchain implementations to learn. After that, you can progress to another blockchain that may be better aligned with your organization.\r\n\r\nWant to learn more? Check out our <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/blockchain-data-analytics-for-dummies-cheat-sheet-273253/\">Blockchain Data Analytics For Dummies Cheat Sheet.</a>","description":""
,"blurb":"","authors":[{"authorId":27199,"name":"Michael G. Solomon","slug":"michael-solomon","description":"Michael G. Solomon, PhD, is a professor at the University of the Cumberlands who specializes in courses on blockchain and distributed computing systems as well as computer security. He holds numerous security and project management certifications and has written several books on security and project management, including Ethereum For Dummies. 
","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/27199"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"Categorizing blockchain implementations","target":"#tab1"},{"label":"Describing basic blockchain type features","target":"#tab2"},{"label":"Contrasting popular enterprise blockchain implementations","target":"#tab3"}],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business 
Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":0,"slug":null,"isbn":null,"categoryList":null,"amazon":null,"image":null,"title":null,"testBankPinActivationLink":null,"bookOutOfPrint":false,"authorsInfo":null,"authors":null,"_links":null},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-64be922f4db47\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-64be922f4e319\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2021-03-18T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":275188},{"headers":{"creationTime":"2016-03-26T08:09:33+00:00","modifiedTime":"2023-06-09T14:03:51+00:00","timestamp":"2023-06-09T15:01:03+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Looking at the Basics of Statistics, Machine Learning, and Mathematical Methods in Data Science","strippedTitle":"looking at the basics of statistics, machine learning, and mathematical methods in data science","slug":"looking-at-the-basics-of-statistics-machine-learning-and-mathematical-methods-in-data-science","canonicalUrl":"","seo":{"metaDescription":"If statistics has been described as the science of deriving insights from data, then what’s the difference between a statistician and a data scientist? 
Good que","noIndex":0,"noFollow":0},"content":"If statistics has been described as the science of deriving insights from data, then what’s the difference between a statistician and a data scientist? Good question! While many tasks in data science require a fair bit of statistical know how, the scope and breadth of a data scientist’s knowledge and skill base is distinct from those of a statistician. The core distinctions are outlined below.\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Subject matter expertise:</b> One of the core features of data scientists is that they offer a sophisticated degree of expertise in the area to which they apply their analytical methods. Data scientists need this so that they’re able to truly understand the implications and applications of the data insights they generate. A data scientist should have enough subject matter expertise to be able to identify the significance of their findings and independently decide how to proceed in the analysis.</p>\r\n<p class=\"child-para\"><b></b>In contrast, statisticians usually have an incredibly deep knowledge of statistics, but very little expertise in the subject matters to which they apply statistical methods. Most of the time, statisticians are required to consult with external subject matter experts to truly get a firm grasp on the significance of their findings, and to be able to decide the best way to move forward in an analysis.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Mathematical and machine learning approaches:</b> Statisticians rely mostly on statistical methods and processes when deriving insights from data. In contrast, data scientists are required to pull from a wide variety of techniques to derive data insights. These include statistical methods, but also include approaches that are not based in statistics — like those found in mathematics, clustering, classification, and non-statistical machine learning approaches.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab1\" >Seeing the importance of statistical know-how</h2>\r\nYou don't need to go out and get a degree in statistics to practice data science, but you should at least get familiar with some of the more fundamental methods that are used in statistical data analysis. These include:\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Linear regression</b>:<b> </b>Linear regression is useful for modeling the relationships between a dependent variable and one or several independent variables. The purpose of linear regression is to discover (and quantify the strength of) important correlations between dependent and independent variables.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Time-series analysis:</b> Time series analysis involves analyzing a collection of data on attribute values over time, in order to predict future instances of the measure based on the past observational data.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Monte Carlo simulations:</b> The Monte Carlo method is a simulation technique you can use to test hypotheses, to generate parameter estimates, to predict scenario outcomes, and to validate models. The method is powerful because it can be used to very quickly simulate anywhere from 1 to 10,000 (or more) simulation samples for any processes you are trying to evaluate.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Statistics for spatial data:</b> One fundamental and important property of spatial data is that it’s not random. It’s spatially dependent and autocorrelated. 
When modeling spatial data, avoid statistical methods that assume your data is random. Kriging and krige are two statistical methods that you can use to model spatial data. These methods enable you to produce predictive surfaces for entire study areas based on sets of known points in geographic space.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Working with clustering, classification, and machine learning methods</h2>\r\nMachine learning is the application of computational algorithms to learn from (or deduce patterns in) raw datasets. <em>Clustering</em> is a particular type of machine learning — <em>unsupervised</em> machine learning, to be precise, meaning that the algorithms must learn from unlabeled data, and as such, they must use inferential methods to discover correlations.\r\n\r\n<i>Classification</i>, on the other hand, is called supervised machine learning, meaning that the algorithms learn from labeled data. The following descriptions introduce some of the more basic clustering and classification approaches:\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>k-means clustering:</b> You generally deploy k-means algorithms to subdivide data points of a dataset into clusters based on nearest mean values. To determine the optimal division of your data points into clusters, such that the distance between points in each cluster is minimized, you can use k-means clustering.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Nearest neighbor algorithms:</b> The purpose of a nearest neighbor analysis is to search for and locate either a nearest point in space or a nearest numerical value, depending on the attribute you use for the basis of comparison.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Kernel density estimation:</b> An alternative way to identify clusters in your data is to use a density smoothing function. Kernel density estimation (KDE) works by placing a <i>kernel</i> (a weighting function that is useful for quantifying density) on each data point in the data set, and then summing the kernels to generate a kernel density estimate for the overall region.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab3\" >Keeping mathematical methods in the mix</h2>\r\nLots gets said about the value of statistics in the practice of data science, but applied mathematical methods are seldom mentioned. To be frank, mathematics is the basis of all quantitative analyses. Its importance should not be understated. The two following mathematical methods are particularly useful in data science.\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Multi-criteria decision making (MCDM): </b>MCDM is a mathematical decision modeling approach that you can use when you have several criteria or alternatives that you must simultaneously evaluate when making a decision.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Markov chains</b>: A Markov chain is a mathematical method that chains together a series of randomly generated variables that represent the present state in order to model how changes in present state variables affect future states.</p>\r\n</li>\r\n</ul>","description":"If statistics has been described as the science of deriving insights from data, then what’s the difference between a statistician and a data scientist? Good question! While many tasks in data science require a fair bit of statistical know-how, the scope and breadth of a data scientist’s knowledge and skill base is distinct from those of a statistician. 
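To make the methods above a bit more concrete, here is a minimal sketch (not from the original article) of three of them in Python, assuming NumPy, SciPy, and scikit-learn are installed; the data is randomly generated purely for illustration:
<pre class="code"># Minimal sketch (not from the article): three of the methods described above,
# assuming NumPy, SciPy, and scikit-learn are available.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# Linear regression: quantify the relationship between x and y (y = a + bx plus noise).
x = rng.uniform(0, 10, size=(200, 1))
y = 2.0 + 3.0 * x.ravel() + rng.normal(0, 1, size=200)
reg = LinearRegression().fit(x, y)
print('intercept a:', reg.intercept_, 'slope b:', reg.coef_[0])

# k-means clustering: subdivide points into clusters around nearest mean values.
points = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print('cluster centers:', km.cluster_centers_)

# Kernel density estimation: place a kernel on each point and sum the kernels.
kde = gaussian_kde(points.T)
print('density at (0, 0) vs. (10, 10):', kde([[0, 10], [0, 10]]))  # high near a cluster, near zero far away</pre>
Running the script prints an intercept and slope close to the true values of 2 and 3, two cluster centers near (0, 0) and (5, 5), and a density estimate that is high near a cluster center and close to zero far from the data.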
The core distinctions are outlined below.\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Subject matter expertise:</b> One of the core features of data scientists is that they offer a sophisticated degree of expertise in the area to which they apply their analytical methods. Data scientists need this so that they’re able to truly understand the implications and applications of the data insights they generate. A data scientist should have enough subject matter expertise to be able to identify the significance of their findings and independently decide how to proceed in the analysis.</p>\r\n<p class=\"child-para\"><b></b>In contrast, statisticians usually have an incredibly deep knowledge of statistics, but very little expertise in the subject matters to which they apply statistical methods. Most of the time, statisticians are required to consult with external subject matter experts to truly get a firm grasp on the significance of their findings, and to be able to decide the best way to move forward in an analysis.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Mathematical and machine learning approaches:</b> Statisticians rely mostly on statistical methods and processes when deriving insights from data. In contrast, data scientists are required to pull from a wide variety of techniques to derive data insights. These include statistical methods, but also include approaches that are not based in statistics — like those found in mathematics, clustering, classification, and non-statistical machine learning approaches.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab1\" >Seeing the importance of statistical know-how</h2>\r\nYou don't need to go out and get a degree in statistics to practice data science, but you should at least get familiar with some of the more fundamental methods that are used in statistical data analysis. These include:\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Linear regression</b>:<b> </b>Linear regression is useful for modeling the relationships between a dependent variable and one or several independent variables. The purpose of linear regression is to discover (and quantify the strength of) important correlations between dependent and independent variables.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Time-series analysis:</b> Time series analysis involves analyzing a collection of data on attribute values over time, in order to predict future instances of the measure based on the past observational data.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Monte Carlo simulations:</b> The Monte Carlo method is a simulation technique you can use to test hypotheses, to generate parameter estimates, to predict scenario outcomes, and to validate models. The method is powerful because it can be used to very quickly simulate anywhere from 1 to 10,000 (or more) simulation samples for any processes you are trying to evaluate.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Statistics for spatial data:</b> One fundamental and important property of spatial data is that it’s not random. It’s spatially dependent and autocorrelated. When modeling spatial data, avoid statistical methods that assume your data is random. Kriging and krige are two statistical methods that you can use to model spatial data. 
These methods enable you to produce predictive surfaces for entire study areas based on sets of known points in geographic space.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Working with clustering, classification, and machine learning methods</h2>\r\nMachine learning is the application of computational algorithms to learn from (or deduce patterns in) raw datasets. <em>Clustering</em> is a particular type of machine learning —<em>unsupervised</em> machine learning, to be precise, meaning that the algorithms must learn from unlabeled data, and as such, they must use inferential methods to discover correlations.\r\n\r\n<i>Classification</i>, on the other hand, is called supervised machine learning, meaning that the algorithms learn from labeled data. The following descriptions introduce some of the more basic clustering and classification approaches:\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>k-means clustering:</b> You generally deploy k-means algorithms to subdivide data points of a dataset into clusters based on nearest mean values. To determine the optimal division of your data points into clusters, such that the distance between points in each cluster is minimized, you can use k-means clustering.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Nearest neighbor algorithms:</b> The purpose of a nearest neighbor analysis is to search for and locate either a nearest point in space or a nearest numerical value, depending on the attribute you use for the basis of comparison.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Kernel density estimation:</b> An alternative way to identify clusters in your data is to use a density smoothing function. Kernel density estimation (KDE) works by placing a <i>kernel</i> a weighting function that is useful for quantifying density — on each data point in the data set, and then summing the kernels to generate a kernel density estimate for the overall region.</p>\r\n</li>\r\n</ul>\r\n<h2 id=\"tab3\" >Keeping mathematical methods in the mix</h2>\r\nLots gets said about the value of statistics in the practice of data science, but applied mathematical methods are seldom mentioned. To be frank, mathematics is the basis of all quantitative analyses. Its importance should not be understated. The two following mathematical methods are particularly useful in data science.\r\n<ul class=\"level-one\">\r\n\t<li>\r\n<p class=\"first-para\"><b>Multi-criteria decision making (MCDM): </b>MCDM is a<b> </b>mathematical decision modeling approach that you can use when you have several criteria or alternatives that you must simultaneously evaluate when making a decision.</p>\r\n</li>\r\n\t<li>\r\n<p class=\"first-para\"><b>Markov chains</b>: A Markov chain is a mathematical method that chains together a series of randomly generated variables that represent the present state in order to model how changes in present state variables affect future states.</p>\r\n</li>\r\n</ul>","blurb":"","authors":[{"authorId":9232,"name":"Lillian Pierson","slug":"lillian-pierson","description":" <p><b>Lillian Pierson</b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. 
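As a further code-level illustration (again, not from the article), the short sketch below shows the flavor of a Monte Carlo simulation and a Markov chain, assuming only NumPy; the dice example and the two-state chain are made up for demonstration:
<pre class="code"># Minimal sketch (not from the article): a small Monte Carlo estimate and a
# two-state Markov chain, assuming only NumPy.
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo: estimate P(sum of two dice > 9) by simulating many samples.
rolls = rng.integers(1, 7, size=(100_000, 2)).sum(axis=1)
print('estimated P(sum > 9):', (rolls > 9).mean())   # exact value is 6/36

# Markov chain: the next state depends only on the present state.
# Row i of P holds the transition probabilities out of state i.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
state, visits = 0, np.zeros(2)
for _ in range(100_000):
    state = rng.choice(2, p=P[state])
    visits[state] += 1
print('long-run share of time in each state:', visits / visits.sum())</pre>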
Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9232"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"Seeing the importance of statistical know-how","target":"#tab1"},{"label":"Working with clustering, classification, and machine learning methods","target":"#tab2"},{"label":"Keeping mathematical methods in the mix","target":"#tab3"}],"relatedArticles":{"fromBook":[{"articleId":238086,"title":"Data Journalism: Collecting Data for Your Story","slug":"data-journalism-collecting-data-story","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238086"}},{"articleId":238083,"title":"Data Journalism: How to Develop, Tell, and Present the Story","slug":"data-journalism-develop-tell-present-story","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238083"}},{"articleId":238080,"title":"Data Journalism: Why the Story Matters","slug":"data-journalism-story-matters","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238080"}},{"articleId":238077,"title":"The Where in Data Journalism","slug":"the-where-in-data-journalism","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238077"}},{"articleId":238074,"title":"The When in Data Journalism","slug":"the-when-in-data-journalism","categoryList":["technology","computers","macs","general-macs"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/238074"}}],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For 
Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":281677,"slug":"data-science-for-dummies","isbn":"9781119811558","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119811554-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119811554/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/data-science-for-dummies-3rd-edition-cover-9781119811558-203x255.jpg","width":203,"height":255},"title":"Data Science For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"<p><b><b data-author-id=\"9232\">Lillian Pierson</b></b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p>","authors":[{"authorId":9232,"name":"Lillian Pierson","slug":"lillian-pierson","description":" <p><b>Lillian Pierson</b> is the CEO of Data-Mania, where she supports data professionals in transforming into world-class leaders and entrepreneurs. She has trained well over one million individuals on the topics of AI and data science. 
Lillian has assisted global leaders in IT, government, media organizations, and nonprofits.</p> ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9232"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119811558&quot;]}]\" id=\"du-slot-64833eaf8e63a\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119811558&quot;]}]\" id=\"du-slot-64833eaf8ef08\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Two years","lifeExpectancySetFrom":"2023-06-09T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":145318},{"headers":{"creationTime":"2020-12-30T16:10:44+00:00","modifiedTime":"2023-06-09T13:45:14+00:00","timestamp":"2023-06-09T15:01:03+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"An Intro to Aligning Blockchain Data Analytics with Business Goals","strippedTitle":"an intro to aligning blockchain data analytics with business goals","slug":"an-intro-to-aligning-blockchain-data-analytics-with-business-goals","canonicalUrl":"","seo":{"metaDescription":"Want to learn how to align your blockchain data analysis with your business goals? Use this brief intro to discover the best approach.","noIndex":0,"noFollow":0},"content":"Blockchain technology alone cannot provide rich analytics results. For all that blockchain is, it can’t magically provide more data than other technologies. 
Before selecting blockchain technology for any new development or analytics project, clearly justify why such a decision makes sense.\r\n\r\n[caption id=\"attachment_275230\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275230 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-business-goals.png\" alt=\"Blockchain and business goals\" width=\"556\" height=\"371\" /> ©Shutterstock/NicoElNino[/caption]\r\n\r\nIf you already depend on blockchain technology to store data, the decision to use that data for analysis is a lot easier to justify. Here, you examine some reasons why blockchain-supported analytics may allow you to leverage your data in interesting ways.\r\n<h2 id=\"tab1\" >Leveraging newly accessible decentralized tools to analyze blockchain data</h2>\r\nYou’ll want to learn how to manually access and <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/10-uses-for-blockchain-analytics-275153/\">analyze blockchain data</a>. But even though it's important to understand how to exercise granular control over your data throughout the analytics process, higher-level tools make the task easier. The growing number of decentralized data analytics solutions means more opportunities to build analytics models with less effort. Third-party tools may reduce the amount of control you have over the models you deploy, but they can dramatically increase analytics productivity.\r\n\r\nThe following list of blockchain analytics solutions is not exhaustive and is likely to change rapidly. Take a few minutes to conduct your own internet search for blockchain analytics tools. You’ll likely find even more software and services:\r\n<ul>\r\n \t<li><strong>Endor: </strong>A blockchain-based AI prediction platform that has the goal of making the technology accessible to organizations of all sizes. Endor is both a blockchain analytics protocol and a prediction engine that integrates on-chain and off-chain data for analysis.</li>\r\n \t<li><strong>Crystal:</strong> A blockchain analytics platform that integrates with the Bitcoin and Ethereum blockchains and focuses on cryptocurrency transaction analytics. Different Crystal products cater to small organizations, enterprises, and law enforcement agencies.</li>\r\n \t<li><strong>OXT:</strong> The most focused of the three products listed, OXT is an analytics and visualization explorer tool for the Bitcoin blockchain. Although OXT doesn’t provide analytics support for a variety of blockchains, it attempts to provide a wide range of analytics options for Bitcoin.</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Monetizing blockchain data</h2>\r\nToday’s economy is driven by data, and the amount of data being collected about individuals and their behavior is staggering. Think of the last time you accessed your favorite shopping site. Chances are, you saw an ad that you found relevant. Those targeted ads seem to be getting better and better at figuring out what would interest you. The capability to align ads with user preferences depends on an analytics engine acquiring enough data about the user to reliably predict products or services of interest.\r\n\r\nBlockchain data can represent the next logical phase of data’s value to the enterprise. As more and more consumers realize the value of their personal data, interest is growing in the capability to control that data. 
Consumers now want to control how their data is being used and demand incentives or compensation for the use of their data.\r\n\r\nBlockchain technology can provide a central point of presence for personal data and the ability for the data’s owner to authorize access to that data. Removing personal data from common central data stores, such as Google and Facebook, has the potential to revolutionize marketing and advertising. Smaller organizations could access valuable marketing information by asking permission from the data owner as opposed to the large data aggregators. Circumventing big players such as Google and Facebook could reduce marketing costs and allow incentives to flow directly to individuals.\r\n\r\nThere is a long way to go to move away from current personal data usage practices, but blockchain technology makes it possible. This process may be accelerated by emerging regulations that protect individual rights to control private data. For example, the European Union’s <a href=\"https://www.dummies.com/article/technology/cybersecurity/general-data-protections-regulation-gdpr-239606/\">General Data Protection Regulation (GDPR)</a> and the California Consumer Privacy Act (CCPA) both strengthen an individual’s ability to control access to, and use of, their personal data.\r\n<h2 id=\"tab3\" >Exchanging and integrating blockchain data effectively</h2>\r\nMuch of the value of blockchain data is in its capability to relate to off-chain data. Most blockchain apps refer to some data stored in off-chain repositories. It doesn’t make sense to store every type of data in a blockchain. Reference data, which is commonly data that gets updated to reflect changing conditions, may not be good candidates for storing in a blockchain.\r\n<p class=\"article-tips remember\"><a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/what-is-blockchain-and-what-blockchains-are-242782/\">Blockchain technology</a> excels at recording value transfers between owners. All applications define and maintain additional information that supports and provides details for transactions but doesn’t directly participate in transactions. Such information, such as product description or customer notes, may make more sense to store in an off-chain repository.</p>\r\nAny time blockchain apps rely on on-chain and off-chain data, integration methods become a concern. Even if your app uses only on-chain data, it is likely that analytics models will integrate with off-chain data. For example, owners in blockchain environments are identified by addresses. These addresses have no context external to the blockchain. Any association between an address and a real-world identity is likely stored in an off-chain repository.\r\n\r\nAnother example of the need for off-chain data is when analyzing aircraft safety trends. Perhaps your analysis correlates blockchain-based incident and accident data with weather conditions. Although each blockchain transaction contains a timestamp, you’d have to consult an external weather database to determine prevailing weather conditions at the time of the transaction.\r\n\r\nMany examples of the need to integrate off-chain data with on-chain transactions exist. Part of the data acquisition phase of any analytics project is to identify data sources and access methods. In a blockchain analytics project, that process means identifying off-chain data you need to satisfy the goals of your project and how to get that data.\r\n\r\nWant to learn more? 
Check out our <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/blockchain-data-analytics-for-dummies-cheat-sheet-273253/\">Blockchain Data Analytics Cheat Sheet</a>.","description":"Blockchain technology alone cannot provide rich analytics results. For all that blockchain is, it can’t magically provide more data than other technologies. Before selecting blockchain technology for any new development or analytics project, clearly justify why such a decision makes sense.\r\n\r\n[caption id=\"attachment_275230\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275230 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-business-goals.png\" alt=\"Blockchain and business goals\" width=\"556\" height=\"371\" /> ©Shutterstock/NicoElNino[/caption]\r\n\r\nIf you already depend on blockchain technology to store data, the decision to use that data for analysis is a lot easier to justify. Here, you examine some reasons why blockchain-supported analytics may allow you to leverage your data in interesting ways.\r\n<h2 id=\"tab1\" >Leveraging newly accessible decentralized tools to analyze blockchain data</h2>\r\nYou’ll want to learn how to manually access and <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/10-uses-for-blockchain-analytics-275153/\">analyze blockchain data</a>. But, it's also important to understand how to exercise granular control over your data throughout the analytics process, higher-level tools make the task easier. The growing number of decentralized data analytics solutions means more opportunities to build analytics models with less effort. Third-party tools may reduce the amount of control you have over the models you deploy, but they can dramatically increase analytics productivity.\r\n\r\nThe following list of blockchain analytics solutions is not exhaustive and is likely to change rapidly. Take a few minutes to conduct your own internet search for blockchain analytics tools. You’ll likely find even more software and services:\r\n<ul>\r\n \t<li><strong>Endor: </strong>A blockchain-based AI prediction platform that has the goal of making the technology accessible to organizations of all sizes. Endor is both a blockchain analytics protocol and a prediction engine that integrates on-chain and off-chain data for analysis.</li>\r\n \t<li><strong>Crystal:</strong> A blockchain analytics platform that integrates with the Bitcoin and Ethereum blockchains and focuses on cryptocurrency transaction analytics. Different Crystal products cater to small organizations, enterprises, and law enforcement agencies.</li>\r\n \t<li><strong>OXT:</strong> The most focused of the three products listed, OXT is an analytics and visualization explorer tool for the Bitcoin blockchain. Although OXT doesn’t provide analytics support for a variety of blockchains, it attempts to provide a wide range of analytics options for Bitcoin.</li>\r\n</ul>\r\n<h2 id=\"tab2\" >Monetizing blockchain data</h2>\r\nToday’s economy is driven by data, and the amount of data being collected about individuals and their behavior is staggering. Think of the last time you accessed your favorite shopping site. Chances are, you saw an ad that you found relevant. Those targeted ads seem to be getting better and better at figuring out what would interest you. 
The capability to align ads with user preferences depends on an analytics engine acquiring enough data about the user to reliably predict products or services of interest.\r\n\r\nBlockchain data can represent the next logical phase of data’s value to the enterprise. As more and more consumers realize the value of their personal data, interest is growing in the capability to control that data. Consumers now want to control how their data is being used and demand incentives or compensation for the use of their data.\r\n\r\nBlockchain technology can provide a central point of presence for personal data and the ability for the data’s owner to authorize access to that data. Removing personal data from common central data stores, such as Google and Facebook, has the potential to revolutionize marketing and advertising. Smaller organizations could access valuable marketing information by asking permission from the data owner as opposed to the large data aggregators. Circumventing big players such as Google and Facebook could reduce marketing costs and allow incentives to flow directly to individuals.\r\n\r\nThere is a long way to go to move away from current personal data usage practices, but blockchain technology makes it possible. This process may be accelerated by emerging regulations that protect individual rights to control private data. For example, the European Union’s <a href=\"https://www.dummies.com/article/technology/cybersecurity/general-data-protections-regulation-gdpr-239606/\">General Data Protection Regulation (GDPR)</a> and the California Consumer Privacy Act (CCPA) both strengthen an individual’s ability to control access to, and use of, their personal data.\r\n<h2 id=\"tab3\" >Exchanging and integrating blockchain data effectively</h2>\r\nMuch of the value of blockchain data is in its capability to relate to off-chain data. Most blockchain apps refer to some data stored in off-chain repositories. It doesn’t make sense to store every type of data in a blockchain. Reference data, which is commonly data that gets updated to reflect changing conditions, may not be good candidates for storing in a blockchain.\r\n<p class=\"article-tips remember\"><a href=\"https://www.dummies.com/article/business-careers-money/personal-finance/cryptocurrency/what-is-blockchain-and-what-blockchains-are-242782/\">Blockchain technology</a> excels at recording value transfers between owners. All applications define and maintain additional information that supports and provides details for transactions but doesn’t directly participate in transactions. Such information, such as product description or customer notes, may make more sense to store in an off-chain repository.</p>\r\nAny time blockchain apps rely on on-chain and off-chain data, integration methods become a concern. Even if your app uses only on-chain data, it is likely that analytics models will integrate with off-chain data. For example, owners in blockchain environments are identified by addresses. These addresses have no context external to the blockchain. Any association between an address and a real-world identity is likely stored in an off-chain repository.\r\n\r\nAnother example of the need for off-chain data is when analyzing aircraft safety trends. Perhaps your analysis correlates blockchain-based incident and accident data with weather conditions. 
Although each blockchain transaction contains a timestamp, you’d have to consult an external weather database to determine prevailing weather conditions at the time of the transaction.\r\n\r\nMany examples of the need to integrate off-chain data with on-chain transactions exist. Part of the data acquisition phase of any analytics project is to identify data sources and access methods. In a blockchain analytics project, that process means identifying off-chain data you need to satisfy the goals of your project and how to get that data.\r\n\r\nWant to learn more? Check out our <a href=\"https://www.dummies.com/article/technology/information-technology/data-science/general-data-science/blockchain-data-analytics-for-dummies-cheat-sheet-273253/\">Blockchain Data Analytics Cheat Sheet</a>.","blurb":"","authors":[{"authorId":27199,"name":"Michael G. Solomon","slug":"michael-solomon","description":"Michael G. Solomon, PhD, is a professor at the University of the Cumberlands who specializes in courses on blockchain and distributed computing systems as well as computer security. He holds numerous security and project management certifications and has written several books on security and project management, including Ethereum For Dummies. ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/27199"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"Leveraging newly accessible decentralized tools to analyze blockchain data","target":"#tab1"},{"label":"Monetizing blockchain data","target":"#tab2"},{"label":"Exchanging and integrating blockchain data effectively","target":"#tab3"}],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat 
Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":0,"slug":null,"isbn":null,"categoryList":null,"amazon":null,"image":null,"title":null,"testBankPinActivationLink":null,"bookOutOfPrint":false,"authorsInfo":null,"authors":null,"_links":null},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-64833eaf30ef8\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-64833eaf317a5\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2021-03-18T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":275229},{"headers":{"creationTime":"2016-03-27T16:46:47+00:00","modifiedTime":"2023-06-05T15:21:16+00:00","timestamp":"2023-06-05T18:01:02+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information 
Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"Tableau For Dummies Cheat Sheet","strippedTitle":"tableau for dummies cheat sheet","slug":"tableau-for-dummies-cheat-sheet","canonicalUrl":"","seo":{"metaDescription":"As you're learning to use the Tableau business intelligence platform, use these tips for data layout and and cleansing options.","noIndex":0,"noFollow":0},"content":"Tableau is not a single application but rather a collection of applications that create a best-in-class business intelligence platform. You may want to dive right in and start trying to create magnificent visualizations, but there are a few concepts you should know about to refine your data and optimize visualizations.\r\n\r\nYou’ll need to determine whether your data set requires data cleansing. In that case, you’ll utilize Tableau Prep. If you want to collaborate and share your data, reports, and visualizations, you’ll use either Tableau Cloud or Tableau Server.\r\n\r\nCentral to the Tableau solution suite is Tableau Desktop; it’s at the heart of the creative engine for virtually all users at some point in time to create visualization renderings from workbooks, dashboards, and stories.\r\n\r\nKeep reading for tips about data layout and cleansing data in Tableau Prep.","description":"Tableau is not a single application but rather a collection of applications that create a best-in-class business intelligence platform. You may want to dive right in and start trying to create magnificent visualizations, but there are a few concepts you should know about to refine your data and optimize visualizations.\r\n\r\nYou’ll need to determine whether your data set requires data cleansing. In that case, you’ll utilize Tableau Prep. If you want to collaborate and share your data, reports, and visualizations, you’ll use either Tableau Cloud or Tableau Server.\r\n\r\nCentral to the Tableau solution suite is Tableau Desktop; it’s at the heart of the creative engine for virtually all users at some point in time to create visualization renderings from workbooks, dashboards, and stories.\r\n\r\nKeep reading for tips about data layout and cleansing data in Tableau Prep.","blurb":"","authors":[{"authorId":34674,"name":"Jack Hyman","slug":"jack-hyman","description":"<b>Jack Hyman</b> is chief executive officer of HyerTek, an IT consulting firm specializing in Microsoft’s business platforms. He is associate professor in the Computer Information Sciences department at the University of the Cumberlands. 
He has written several books in the <i>For Dummies</i> series, as well as certification study guides for the Microsoft Azure technology.","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/34674"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting Blockchain into Today’s Business 
Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":292788,"slug":"tableau-for-dummies","isbn":"9781119684589","categoryList":["technology","information-technology","data-science","general-data-science"],"amazon":{"default":"https://www.amazon.com/gp/product/1119684587/ref=as_li_tl?ie=UTF8&tag=wiley01-20","ca":"https://www.amazon.ca/gp/product/1119684587/ref=as_li_tl?ie=UTF8&tag=wiley01-20","indigo_ca":"http://www.tkqlhce.com/click-9208661-13710633?url=https://www.chapters.indigo.ca/en-ca/books/product/1119684587-item.html&cjsku=978111945484","gb":"https://www.amazon.co.uk/gp/product/1119684587/ref=as_li_tl?ie=UTF8&tag=wiley01-20","de":"https://www.amazon.de/gp/product/1119684587/ref=as_li_tl?ie=UTF8&tag=wiley01-20"},"image":{"src":"https://www.dummies.com/wp-content/uploads/tableau-for-dummies-cover-9781119684589-203x255.jpg","width":203,"height":255},"title":"Tableau For Dummies","testBankPinActivationLink":"","bookOutOfPrint":true,"authorsInfo":"<p><p><b><b data-author-id=\"9018\">Molly Monsey</b> </b>joined Tableau in 2009 as a technical product consultant. She and Paul Sochan work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b>Paul Sochan </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with <b data-author-id=\"9018\">Molly Monsey</b> develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994. <p><b>Molly Monsey </b>joined Tableau in 2009 as a technical product consultant. She and <b data-author-id=\"9019\">Paul Sochan</b> work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b><b data-author-id=\"9019\">Paul Sochan</b> </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with Molly Monsey develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994. <p><b>Molly Monsey </b>joined Tableau in 2009 as a technical product consultant. She and Paul Sochan work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b>Paul Sochan </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with Molly Monsey develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994.</p>","authors":[{"authorId":9018,"name":"Molly Monsey","slug":"molly-monsey","description":" <p><b>Molly Monsey </b>joined Tableau in 2009 as a technical product consultant. She and Paul Sochan work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b>Paul Sochan </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with Molly Monsey develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994. 
","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9018"}},{"authorId":9019,"name":"Paul Sochan","slug":"paul-sochan","description":" <p><b>Molly Monsey </b>joined Tableau in 2009 as a technical product consultant. She and Paul Sochan work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b>Paul Sochan </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with Molly Monsey develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994. ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/9019"}},{"authorId":35298,"name":"Jack A. Hyman","slug":"jack-a-hyman","description":" <p><b>Molly Monsey </b>joined Tableau in 2009 as a technical product consultant. She and Paul Sochan work together to lead the Tableau training team. Today she recruits, trains, and supports instructors who educate Tableau users all over the world. <b>Paul Sochan </b>joined Tableau in 2010 and serves as the Senior Director of Global Education Services. The training team he built with Molly Monsey develops all Tableau training offerings. Paul has been in the Business Intelligence space since 1994. ","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/35298"}}],"_links":{"self":"https://dummies-api.dummies.com/v2/books/"}},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119684589&quot;]}]\" id=\"du-slot-647e22df164f2\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[&quot;9781119684589&quot;]}]\" id=\"du-slot-647e22df175fb\"></div></div>"},"articleType":{"articleType":"Cheat Sheet","articleList":[{"articleId":139328,"title":"A Tableau Glossary","slug":"a-tableau-glossary","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/139328"}},{"articleId":139294,"title":"Keyboard Shortcuts for Tableau","slug":"keyboard-shortcuts-for-tableau","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/139294"}},{"articleId":139298,"title":"Publishing to Tableau Server and Tableau Online","slug":"publishing-to-tableau-server-and-tableau-online","categoryList":[],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/139298"}}],"content":[{"title":"Selecting the right Tableau application","thumb":null,"image":null,"content":"<p>To suit your business objective, you have nine different applications of Tableau to choose from. Although Tableau Desktop, Tableau Prep Builder, and Tableau Cloud form the core of the Tableau offerings, that doesn’t mean that the other applications aren’t important. 
Each has its place in the Tableau Business Intelligence ecosystem. The following list describes the merit of each product:</p>\n<ul>\n<li><strong>Tableau Desktop</strong> is the cornerstone product of the suite, offering data visualization and analysis. It allows users to create interactive visualizations, dashboards, and reports by connecting to and analyzing data from various sources, including spreadsheets, databases, and cloud services.</li>\n<li><strong>Tableau Prep </strong>helps the analyst prepare, clean, and shape data for analysis and visualization using a visual, user-friendly interface.</li>\n<li><strong>Tableau Cloud </strong>is the cloud-based platform for data analytics and visualization management post-publishing in Tableau Desktop and hosted by Tableau. The platform allows users to store, share, and collaborate using Tableau workbooks, data sources, and dashboards in a secure, web-based environment.</li>\n<li><strong>Tableau Server</strong> is a business intelligence and analytics platform hosted by an enterprise organization that allows users to publish, share, and collaborate on interactive dashboards, reports, and visualizations much like those in Tableau Cloud. The big difference with this platform is that the organization is responsible for the infrastructure, not Tableau.</li>\n<li><strong>Tableau Public</strong> is the public platform for sharing data visualization and analysis assets developed by Tableau users online. Although Tableau Cloud or Tableau Server controls assets based on permissions, there are no permissions allowed with Tableau Public All the work created in Tableau Public is fully exposed for any user to consume, including the datasets, reports, dashboards, and visualizations.</li>\n<li><strong>CRM Analytics </strong>is the former Salesforce Einstein Analytics product. The platform developed by Salesforce integrates CRM Analytics data to visualize and analyze data, leveraging artificial intelligence.</li>\n<li><strong>Data Management</strong> is a collection of tools that help organizations prepare their data for analysis. The collection includes Tableau Prep, Tableau Catalog, and Tableau Server Management Add-Ons. Data Management is limited in functionality for Tableau Cloud.</li>\n<li><strong>Embedded Analytics </strong>enables end users to embed Tableau visualizations and interactive dashboards into their own applications and websites. You’ll need to utilize a variety of APIs, SDKs, and web connectors to enable the visualizations to support a complete user experience.</li>\n<li><strong>Advanced Management </strong>is a compilation of add-on features included across the products, such as advanced analytics models, data science integrations, governance and security support, and scalability and performance enhancements. Depending on your version of Tableau, one or more of these features can be accessed.</li>\n</ul>\n"},{"title":"Understanding data layout in Tableau","thumb":null,"image":null,"content":"<p>When you’re trying to craft your visualizations in Tableau, you’ll become very familiar with the green and blue pill. 
The <em>pill</em> is a visual representation of data, either a discrete or continuous measure (quantitative data) or a discrete or continuous dimension (qualitative data).</p>\n<p>Here are some key rules when placing the pills on the Tableau shelves in Tableau Desktop:</p>\n<ul>\n<li>Dimensions generally turn into blue pills.</li>\n<li>Measures, which are numeric values, are considered continuous in nature.</li>\n<li>Measures oftentimes turn into green pills.</li>\n</ul>\n<p>When you’re looking at a Tableau visualization, continuous variables take on any value within a range of values on a continuous scale. The range of values may even have decimals or fractional values. Examples include age, weight, temperature, salary, or height.</p>\n<p>Continuous variables in Tableau are typically displayed on an axis and can be visualized using line charts, scatter plots, or heat maps. In the following visualization, the dollar amount for the Federal Prison System is an example of a continuous scale.</p>\n<div class=\"figure-container\"><figure id=\"attachment_299161\" aria-labelledby=\"figcaption_attachment_299161\" class=\"wp-caption alignnone\" style=\"width: 545px\"><img loading=\"lazy\" class=\"size-full wp-image-299161\" src=\"https://www.dummies.com/wp-content/uploads/9781119684589-cs01.jpg\" alt=\"Graph showing example of a continuous scale of data\" width=\"535\" height=\"380\" /><figcaption id=\"figcaption_attachment_299161\" class=\"wp-caption-text\">©John Wiley &amp; Sons, Inc.<br />Example of a continuous scale</figcaption></figure></div><div class=\"clearfix\"></div>\n<p>Discrete variables can take on a limited number of distinct values. Consider discrete variables to be categorical in nature. They can’t take fractional or decimal values. Discrete variables include gender, race, city, state, zip codes, and country. Notice a pattern: For each of these, you pick exactly one value from a limited set of options.</p>\n<p>Discrete variables are displayed using bars or targeted points in visualizations such as bar charts, pie charts, or histograms. In the visualization just presented, each federal agency that has received a budget allocation represents a discrete value.</p>\n"},{"title":"Tips for cleansing data in Tableau Prep","thumb":null,"image":null,"content":"<p>Most datasets have flaws. To ensure that your data is presented accurately in a visualization, a prerequisite step involves extracting, transforming, and loading (ETL) the data into Tableau Prep through the Connections pane (shown in the following figure).</p>\n<p>After the Connections pane processes the data via import, the table will either appear on the Flow pane (see the figure below), or you must drag one or more tables to the Flow pane. 
The Flow pane is also referred to, interchangeably, as the canvas.</p>\n<p>You can then add one or more steps to begin cleansing your data by pressing the + sign, also shown in the figure.</p>\n<div class=\"figure-container\"><figure id=\"attachment_299164\" aria-labelledby=\"figcaption_attachment_299164\" class=\"wp-caption alignnone\" style=\"width: 545px\"><img loading=\"lazy\" class=\"size-full wp-image-299164\" src=\"https://www.dummies.com/wp-content/uploads/9781119684589-cs02.jpg\" alt=\"Screenshot showing the Tableau Flow pane\" width=\"535\" height=\"323\" /><figcaption id=\"figcaption_attachment_299164\" class=\"wp-caption-text\">©John Wiley &amp; Sons, Inc.<br />The Tableau Flow pane</figcaption></figure></div><div class=\"clearfix\"></div>\n<p>Common cleansing options available throughout Tableau include:</p>\n<ul>\n<li><strong>Remove duplicates: </strong>Select Duplicate Field to remove duplicate rows in your dataset.</li>\n<li><strong>Rename fields: </strong>To make your field names more meaningful, rename the fields in Tableau Prep by selecting Rename Field.</li>\n<li><strong>Remove null values:</strong> Select Filter and then Null Values to eliminate null or missing values from your dataset.</li>\n<li><strong>Split columns: </strong>Select Split Values if you need to split a column into multiple columns. For example, a single field might contain several values that belong in separate columns, or perhaps you want to break up values from A–L and M–Z. Those are classic examples of where the Split Values option is appropriate.</li>\n<li><strong>Change a data type: </strong>Sometimes Tableau doesn’t select the correct data type. For example, a field may be recognized as a string, but all the values are numeric. In that case, select the Data Type icon in the left corner of the column to change the data type. The icon appears initially as the letters <em>ABC.</em></li>\n<li><strong>Clean:</strong> You may find erroneous characters that appear frequently in a column. Should that be the case, select Clean and then choose among the options that appear: Make Uppercase, Make Lowercase, Remove Numbers, Remove Letters, and Remove Punctuation.</li>\n<li><strong>Filter data:</strong> Use the Filter option to remove any unnecessary rows from your data, including selected or wildcard values.</li>\n</ul>\n<p>Although there are many other data cleansing options, these are Tableau’s most utilized ones. 
The image below provides an example of the menu where you can find all these filtering features to begin the data-cleansing journey in Tableau Prep.</p>\n<div class=\"figure-container\"><figure id=\"attachment_299163\" aria-labelledby=\"figcaption_attachment_299163\" class=\"wp-caption alignnone\" style=\"width: 327px\"><img loading=\"lazy\" class=\"size-full wp-image-299163\" src=\"https://www.dummies.com/wp-content/uploads/9781119684589-cs03.jpg\" alt=\"Screenshot showing Tableau's menu of data cleansing options\" width=\"317\" height=\"400\" /><figcaption id=\"figcaption_attachment_299163\" class=\"wp-caption-text\">©John Wiley &amp; Sons, Inc.<br />Tableau menu for data cleansing options</figcaption></figure></div><div class=\"clearfix\"></div>\n"}],"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"Two years","lifeExpectancySetFrom":"2023-06-05T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":207419},{"headers":{"creationTime":"2020-12-29T19:51:10+00:00","modifiedTime":"2022-08-04T20:14:48+00:00","timestamp":"2022-09-14T18:19:51+00:00"},"data":{"breadcrumbs":[{"name":"Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33512"},"slug":"technology","categoryId":33512},{"name":"Information Technology","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33572"},"slug":"information-technology","categoryId":33572},{"name":"Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33577"},"slug":"data-science","categoryId":33577},{"name":"General Data Science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"},"slug":"general-data-science","categoryId":33580}],"title":"10 Uses for Blockchain Analytics","strippedTitle":"10 uses for blockchain analytics","slug":"10-uses-for-blockchain-analytics","canonicalUrl":"","seo":{"metaDescription":"Are you curious to learn how blockchain data analytics can be used? Check out this list of top ten uses, from Dummies.com.","noIndex":0,"noFollow":0},"content":"A common question from management when first considering data analytics and again in the specific context of blockchain is “Why do we need this?” Your organization will have to answer that question, in general, and you’ll need to explain why building and executing analytics models on your blockchain data will benefit your organization. Without an expected return on investment (ROI), management probably won't authorize and fund any analytics efforts.\r\n\r\nThe good news is that you aren’t the pioneer in blockchain analytics. Other organizations of all sizes have seen the value of formal analysis of blockchain data. Examining what other organizations have done can be encouraging and insightful. You’ll probably find some fresh ideas as you familiarize yourself with what others have accomplished with their blockchain analytics projects.\r\n\r\nHere, you learn about ten ways in which blockchain analytics can be useful to today’s (and tomorrow’s) organizations. 
Blockchain analytics focuses on analyzing what happened in the past, explaining what's happening now, and even preparing for what's expected to come in the future. Analytics can help any organization react, understand, prepare, and lower overall risk.\r\n<h2 id=\"tab1\" >Accessing public financial transaction data</h2>\r\nThe first blockchain implementation, <a href=\"https://www.dummies.com/software/bitcoin-basics/\" target=\"_blank\" rel=\"noopener\">Bitcoin</a>, is all about cryptocurrency, so it stands to reason that examining financial transactions would be an obvious use of blockchain analytics. If tracking transactions was your first thought of how to use blockchain analytics, you’d be right. Bitcoin and other blockchain cryptocurrencies used to be viewed as completely anonymous methods of executing financial transactions.\r\n\r\nThe flawed perception of complete anonymity enticed criminals to use the new type of currency to conduct illegal business. Since cryptocurrency accounts aren’t directly associated with real-world identities (at least on the blockchain), any users who wanted to conduct secret business warmed up to Bitcoin and other cryptocurrencies.\r\n\r\nWhen law enforcement noticed the growth in cryptocurrency transactions, they began looking for ways to re-identify transactions of interest. It turns out that with a little effort and proper legal authority, it isn’t that hard to figure out who owns a cryptocurrency account. When a cryptocurrency account is converted and transferred to a traditional account, many criminals are unmasked. Law enforcement became an early adopter of blockchain analytics and still uses models today to help identify suspected criminal and fraudulent activity.\r\n\r\nChainalysis is a company that specializes in cryptocurrency investigations. Their product, <a href=\"https://www.chainalysis.com/chainalysis-reactor/\" target=\"_blank\" rel=\"noopener\">Chainalysis Reactor</a>, allows users to conduct cryptocurrency forensics to connect transactions to real-world identities. The image shows the Chainalysis Reactor tool.\r\n\r\n[caption id=\"attachment_275154\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275154 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-chainalysis.png\" alt=\"Chainalysis Reactor\" width=\"556\" height=\"280\" /> Chainalysis Reactor[/caption]\r\n\r\nBut blockchain technology isn’t just for criminals, and blockchain analytics isn’t just to catch bad guys. The growing popularity of blockchain and cryptocurrencies could lead to new ways to evaluate entire industries, P2P transactions, currency flow, the wealth of nation-states, and a variety of other market valuations with this new area of analysis. For example, <a href=\"https://www.dummies.com/personal-finance/top-10-ethereum-uses/\" target=\"_blank\" rel=\"noopener\">Ethereum</a> has emerged as a major avenue of fundraising for tech startups, and its analysis could lend a deeper look into the industry.\r\n<h2 id=\"tab2\" >Connecting with the Internet of Things (IoT)</h2>\r\nThe <em>Internet of Things (IoT) </em>is loosely defined as the collection of devices of all sizes that are connected to the internet and operate at some level with little human interaction. IoT devices include doorbell cameras, remote temperature sensors, undersea oil leak detectors, refrigerators, and vehicle components. 
The list is almost endless, as is the number of devices connecting to the internet.\r\n\r\nEach IoT device has a unique identity and produces and consumes data. All of these devices need some entity that manages data exchange and the device’s operation. Although most IoT devices are autonomous (they operate without the need for external guidance), all devices eventually need to request or send data to someone. But that someone doesn’t have to be a human.\r\n\r\nCurrently, the centralized nature of traditional IoT systems reduces their scalability and can create bottlenecks. A central management entity can handle only a limited number of devices. Many companies working in the IoT space are looking to leverage the smart contracts in blockchain networks to allow IoT devices to work more securely and autonomously. These smart contracts are becoming increasingly attractive as the number of IoT devices exceeds 20 billion worldwide in 2020.\r\n\r\nThe figure below shows how IoT has matured from a purely centralized network in the past to a distributed network (which still had some central hubs) to a vision of the future without the need for central managers.\r\n\r\n[caption id=\"attachment_275155\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275155 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-iot.png\" alt=\"IoT\" width=\"556\" height=\"222\" /> Moving toward distributed, autonomous IoT[/caption]\r\n\r\nThe applications of IoT data are endless, and if the industry does shift in this direction, knowing and understanding blockchain analytics will be necessary to truly unlock its potential. Using blockchain technology to manage IoT devices is only the beginning. Without the application of analytics to really understand the huge volume of data IoT devices will be generating, much of the value of having so many autonomous devices will be lost.\r\n<h2 id=\"tab3\" >Ensuring data and document authenticity</h2>\r\nThe Lenovo Group is a multinational technology company that manufactures and distributes consumer electronics. During a business process review, Lenovo identified several areas of inefficiency in their supply chain. After analyzing the issues, they decided to incorporate blockchain technology to increase visibility, consistency, and autonomy, and to decrease waste and process delays. Lenovo published a paper, <a href=\"https://lenovopress.com/lp1221.pdf\" target=\"_blank\" rel=\"noopener\">“Blockchain Technology for Business: A Lenovo Point of View,”</a> detailing their efforts and results.\r\n\r\nIn addition to describing their supply chain application of blockchain technology in their paper, Lenovo cited examples of how the <em>New York Times</em> uses blockchain to prove that photos are authentic. They also described how the city of Dubai is working to have all its government documents on blockchain by the end of 2020 in an effort to crack down on corruption and the misuse of funds.\r\n\r\nIn the era of deep fakes, manipulated photos and consistently evolving methods of corruption and misappropriation of funds, blockchain can help identify cases of data fraud and misuse. Blockchain’s inherent transparency and immutability means that data cannot be retroactively manipulated to support a narrative. Facts in a blockchain are recorded as unchangeable facts. 
Analytics models can help researchers understand how data of any type originated, who the original owner was, how it gets amended over time, and if any amendments are coordinated.\r\n<h2 id=\"tab4\" >Controlling secure document integrity</h2>\r\nAs just mentioned, blockchain technology can be used to ensure document authenticity, but it can be used also to ensure document integrity. In areas where documents should not be able to be altered, such as the legal and healthcare industries, blockchain can help make documents and changes to them transparent and immutable, as well as increase the power the owner of the data has to control and manage it.\r\n\r\nDocuments do not have to be stored in the blockchain to benefit from the technology. Documents can be stored in off-chain repositories, with a hash stored in a block on the blockchain. Each transaction (required to write to a new block) contains the owner’s account and a timestamp of the action. The integrity of any document at a specific point in time can be validated simply by comparing the on-chain hash with the calculated hash value of the document. If the hash values match, the document has not changed since the blockchain transaction was created.\r\n\r\nThe company <a href=\"https://docstamp.io/\" target=\"_blank\" rel=\"noopener\">DocStamp</a> has implemented a novel use for blockchain document management. Using DocStamp, shown below, anyone can self-notarize any document. The document owner maintains control of the document while storing a hash of the document on an Ethereum blockchain.\r\n\r\n[caption id=\"attachment_275156\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275156 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-.png\" alt=\"DocStamp\" width=\"556\" height=\"313\" /> The DocStamp website[/caption]\r\n<p class=\"article-tips warning\">Services such as DocStamp provide the capability to ensure document integrity using blockchain technology. However, assessing document integrity and its use is up to analytics models. The DocStamp model is not generally recognized by courts of law to be as strong as a traditional notary. For that to change, analysts will need to provide model results that show how the approach works and how blockchain can help provide evidence that document integrity is ensured.</p>\r\n\r\n<h2 id=\"tab5\" >Tracking supply chain items</h2>\r\nIn the Lenovo blockchain paper, the author described how Lenovo replaced printed paperwork in its supply chain with processes managed through smart contracts. The switch to blockchain-based process management greatly decreased the potential for human error and removed many human-related process delays. Replacing human interaction with electronic transaction increased auditability and gave all parties more transparency in the movement of goods. The Lenovo supply chain became more efficient and easier to investigate.\r\n\r\nBlockchain-based supply chain solutions are one of the most popular ways to implement blockchain technology. Blockchain technology makes it easy to track items along the supply chain, both forward and backward. The capability to track an item makes it easy to determine where an item is and where that item has been. Tracing an item’s provenance, or origin, makes root cause analysis possible. 
Because the blockchain keeps all history of movement through the supply chain, many types of analysis are easier than traditional data stores which can overwrite data.\r\n\r\nThe US Food and Drug Administration is working with several private firms to evaluate using blockchain technology supply chain applications to identify, track, and trace prescription drugs. Analysis of the blockchain data can provide evidence for identifying counterfeit drugs and delivery paths criminals use to get those drugs to market.\r\n<h2 id=\"tab6\" >Empowering predictive analytics</h2>\r\nYou can build several models that allow you to predict future behavior based on past observations. Predictive analytics is often one of the goals of an organization’s analytics projects. Large organizations may already have a collection of data that supports prediction. Smaller organizations, however, probably lack enough data to make accurate predictions. Even large organizations would still benefit from datasets that extend beyond their own customers and partners.\r\n\r\nIn the past, a common approach to acquiring enough data for meaningful analysis was to purchase data from an aggregator. Each data acquisition request costs money, and the data you receive may still be limited in scope. The prospect of using public blockchains has the potential to change the way we all access public data. If a majority of supply chain interactions, for example, use a public blockchain, that data is available to anyone — for free.\r\n\r\nAs more organizations incorporate blockchains into their operations, analysts could leverage the additional data to empower more companies to use predictive analytics with less reliance on localized data.\r\n<h2 id=\"tab7\" >Analyzing real-time data</h2>\r\nBlockchain transactions happen in real time, across intranational and international borders. Not only are banks and innovators in financial technology pursuing blockchain for the speed it offers to transactions, but data scientists and analysts are observing blockchain data changes and additions in real time, greatly increasing the potential for fast decision-making.\r\n\r\nTo view how dynamic blockchain data really is, visit the <a href=\"http://ethviewer.live/\" target=\"_blank\" rel=\"noopener\">Ethviewer</a> Ethereum blockchain monitor’s website. The following image shows the Ethviewer website.\r\n\r\n[caption id=\"attachment_275157\" align=\"aligncenter\" width=\"556\"]<img class=\"wp-image-275157 size-full\" src=\"https://www.dummies.com/wp-content/uploads/blockchain-data-analytics-ethviewer.png\" alt=\"Ethviewer Ethereum blockchain monitor\" width=\"556\" height=\"313\" /> Ethviewer Ethereum blockchain monitor[/caption]\r\n\r\nEach small circle in the blob near the lower-left corner of the web page is a distinct transaction waiting to make it into a new block. You can see how dynamic the Ethereum blockchain is — it changes constantly. And when the blockchain changes, so does the blockchain data that your models use to provide accurate results.\r\n<h2 id=\"tab8\" >Supercharging business strategy</h2>\r\nCompanies big and small — marketing firms, financial technology giants, small local retailers, and many more — can fine-tune their strategies to keep up with, and even get ahead of, shifts in the market, the economy, and their customer base. How? By utilizing the results of analytics models built on the organization’s blockchain data.\r\n\r\nThe ultimate goal for any analytics project is to provide ROI for the sponsoring organization. 
<a href=\"https://www.dummies.com/programming/big-data/blockchain-data-analytics-for-dummies-cheat-sheet/\" target=\"_blank\" rel=\"noopener\">Blockchain analytics</a> projects provide a unique opportunity to provide value. New blockchain implementations are only recently becoming common in organizations, and now is the time to view those sources of data as new opportunities to provide value. Analytics can help identify potential sources of ROI.\r\n<h2 id=\"tab9\" >Managing data sharing</h2>\r\nBlockchain technology is often referred to as a disruptive technology, and there is some truth to that characterization. Blockchain does disrupt many things. In the context of data analytics, blockchain changes the way analysts acquire at least some of their data.\r\n\r\nIf a public or consortium blockchain is the source for an analytics model, it's a near certainty that the sponsoring organization does not own all the data. Much of the data in a non-private blockchain comes from other entities that decided to place the data in a shared repository, the blockchain.\r\n\r\nBlockchain can aid in the storage of data in a distributed network and make that data easily accessible to project teams. Easy access to data makes the whole analytics process easier. There still may be a lot of work to do, but you can always count on the facts that blockchain data is accessible and it hasn’t changed since it was written. Blockchain makes collaboration among data analysts and other data consumers easier than with more traditional data repositories.\r\n<h2 id=\"tab10\" >Standardizing collaboration forms</h2>\r\nBlockchain technology empowers analytics in more ways than just providing access to more data. Regardless of whether blockchain technology is deployed in the healthcare, legal, government, or other organizational domain, blockchain can lead to more efficient process automation.\r\n\r\nAlso, blockchain’s revolutionary approach to how data is generated and shared among parties can lead to better and greater standardization in how end users populate forms and how other data gets collected. Blockchains can help encourage adherence to agreed-upon standards for data handling.\r\n\r\nThe use of data-handling standards will greatly decrease the amount of time necessary for data cleaning and management. Because cleansing data commonly requires a large time investment in the analytics process, standardization through the use of blockchain can make it easier to build and modify models with a short time-to-market.","description":"A common question from management when first considering data analytics and again in the specific context of blockchain is “Why do we need this?” Your organization will have to answer that question, in general, and you’ll need to explain why building and executing analytics models on your blockchain data will benefit your organization. Without an expected return on investment (ROI), management probably won't authorize and fund any analytics efforts.\r\n\r\nThe good news is that you aren’t the pioneer in blockchain analytics. Other organizations of all sizes have seen the value of formal analysis of blockchain data. Examining what other organizations have done can be encouraging and insightful. You’ll probably find some fresh ideas as you familiarize yourself with what others have accomplished with their blockchain analytics projects.\r\n\r\nHere, you learn about ten ways in which blockchain analytics can be useful to today’s (and tomorrow’s) organizations. 
","blurb":"","authors":[{"authorId":27199,"name":"Michael G. Solomon","slug":"michael-solomon","description":"Michael G. Solomon, PhD, is a professor at the University of the Cumberlands who specializes in courses on blockchain and distributed computing systems as well as computer security. He holds numerous security and project management certifications and has written several books on security and project management, including Ethereum For Dummies."}]
","hasArticle":false,"_links":{"self":"https://dummies-api.dummies.com/v2/authors/27199"}}],"primaryCategoryTaxonomy":{"categoryId":33580,"title":"General Data Science","slug":"general-data-science","_links":{"self":"https://dummies-api.dummies.com/v2/categories/33580"}},"secondaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"tertiaryCategoryTaxonomy":{"categoryId":0,"title":null,"slug":null,"_links":null},"trendingArticles":[{"articleId":192609,"title":"How to Pray the Rosary: A Comprehensive Guide","slug":"how-to-pray-the-rosary","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/192609"}},{"articleId":208741,"title":"Kabbalah For Dummies Cheat Sheet","slug":"kabbalah-for-dummies-cheat-sheet","categoryList":["body-mind-spirit","religion-spirituality","kabbalah"],"_links":{"self":"/articles/208741"}},{"articleId":230957,"title":"Nikon D3400 For Dummies Cheat Sheet","slug":"nikon-d3400-dummies-cheat-sheet","categoryList":["home-auto-hobbies","photography"],"_links":{"self":"/articles/230957"}},{"articleId":235851,"title":"Praying the Rosary and Meditating on the Mysteries","slug":"praying-rosary-meditating-mysteries","categoryList":["body-mind-spirit","religion-spirituality","christianity","catholicism"],"_links":{"self":"/articles/235851"}},{"articleId":284787,"title":"What Your Society Says About You","slug":"what-your-society-says-about-you","categoryList":["academics-the-arts","humanities"],"_links":{"self":"/articles/284787"}}],"inThisArticle":[{"label":"Accessing public financial transaction data","target":"#tab1"},{"label":"Connecting with the Internet of Things (IoT)","target":"#tab2"},{"label":"Ensuring data and document authenticity","target":"#tab3"},{"label":"Controlling secure document integrity","target":"#tab4"},{"label":"Tracking supply chain items","target":"#tab5"},{"label":"Empowering predictive analytics","target":"#tab6"},{"label":"Analyzing real-time data","target":"#tab7"},{"label":"Supercharging business strategy","target":"#tab8"},{"label":"Managing data sharing","target":"#tab9"},{"label":"Standardizing collaboration forms","target":"#tab10"}],"relatedArticles":{"fromBook":[],"fromCategory":[{"articleId":289776,"title":"Decision Intelligence For Dummies Cheat Sheet","slug":"decision-intelligence-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289776"}},{"articleId":289744,"title":"Microsoft Power BI For Dummies Cheat Sheet","slug":"microsoft-power-bi-for-dummies-cheat-sheet","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/289744"}},{"articleId":275249,"title":"Laws and Regulations You Should Know for Blockchain Data Analysis Projects","slug":"laws-and-regulations-you-should-know-for-blockchain-data-analysis-projects","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275249"}},{"articleId":275244,"title":"Aligning Blockchain Data with Real-World Business Processes","slug":"aligning-blockchain-data-with-real-world-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275244"}},{"articleId":275239,"title":"Fitting 
Blockchain into Today’s Business Processes","slug":"fitting-blockchain-into-todays-business-processes","categoryList":["technology","information-technology","data-science","general-data-science"],"_links":{"self":"https://dummies-api.dummies.com/v2/articles/275239"}}]},"hasRelatedBookFromSearch":false,"relatedBook":{"bookId":0,"slug":null,"isbn":null,"categoryList":null,"amazon":null,"image":null,"title":null,"testBankPinActivationLink":null,"bookOutOfPrint":false,"authorsInfo":null,"authors":null,"_links":null},"collections":[],"articleAds":{"footerAd":"<div class=\"du-ad-region row\" id=\"article_page_adhesion_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_adhesion_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-63221b47a9937\"></div></div>","rightAd":"<div class=\"du-ad-region row\" id=\"article_page_right_ad\"><div class=\"du-ad-unit col-md-12\" data-slot-id=\"article_page_right_ad\" data-refreshed=\"false\" \r\n data-target = \"[{&quot;key&quot;:&quot;cat&quot;,&quot;values&quot;:[&quot;technology&quot;,&quot;information-technology&quot;,&quot;data-science&quot;,&quot;general-data-science&quot;]},{&quot;key&quot;:&quot;isbn&quot;,&quot;values&quot;:[null]}]\" id=\"du-slot-63221b47aa1b6\"></div></div>"},"articleType":{"articleType":"Articles","articleList":null,"content":null,"videoInfo":{"videoId":null,"name":null,"accountId":null,"playerId":null,"thumbnailUrl":null,"description":null,"uploadDate":null}},"sponsorship":{"sponsorshipPage":false,"backgroundImage":{"src":null,"width":0,"height":0},"brandingLine":"","brandingLink":"","brandingLogo":{"src":null,"width":0,"height":0},"sponsorAd":"","sponsorEbookTitle":"","sponsorEbookLink":"","sponsorEbookImage":{"src":null,"width":0,"height":0}},"primaryLearningPath":"Advance","lifeExpectancy":"One year","lifeExpectancySetFrom":"2021-03-18T00:00:00+00:00","dummiesForKids":"no","sponsoredContent":"no","adInfo":"","adPairKey":[]},"status":"publish","visibility":"public","articleId":275153}],"_links":{"self":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=0"},"next":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=10"},"last":{"self":"https://dummies-api.dummies.com/v2/categories/33577/categoryArticles?sortField=time&sortOrder=1&size=10&offset=359"}}},"objectTitle":"","status":"success","pageType":"article-category","objectId":"33577","page":1,"sortField":"time","sortOrder":1,"categoriesIds":[],"articleTypes":[],"filterData":{"categoriesFilter":[{"itemId":0,"itemName":"All Categories","count":369},{"itemId":33578,"itemName":"Big Data","count":174},{"itemId":33579,"itemName":"Databases","count":8},{"itemId":33580,"itemName":"General Data Science","count":173},{"itemId":34365,"itemName":"Web Analytics","count":14}],"articleTypeFilter":[{"articleType":"All Types","count":369},{"articleType":"Articles","count":343},{"articleType":"Cheat Sheet","count":18},{"articleType":"Step by Step","count":8}]},"filterDataLoadedStatus":"success","pageSize":10},"adsState":{"pageScripts":{"headers":{"timestamp":"2025-04-17T15:50:01+00:00"},"adsId":0,"data":{"scripts":[{"pages":["all"],"location":"header","script":"<!--Optimizely Script-->\r\n<script 
src=\"https://cdn.optimizely.com/js/10563184655.js\"></script>","enabled":false},{"pages":["all"],"location":"header","script":"<!-- comScore Tag -->\r\n<script>var _comscore = _comscore || [];_comscore.push({ c1: \"2\", c2: \"15097263\" });(function() {var s = document.createElement(\"script\"), el = document.getElementsByTagName(\"script\")[0]; s.async = true;s.src = (document.location.protocol == \"https:\" ? \"https://sb\" : \"http://b\") + \".scorecardresearch.com/beacon.js\";el.parentNode.insertBefore(s, el);})();</script><noscript><img src=\"https://sb.scorecardresearch.com/p?c1=2&c2=15097263&cv=2.0&cj=1\" /></noscript>\r\n<!-- / comScore Tag -->","enabled":true},{"pages":["all"],"location":"footer","script":"<!--BEGIN QUALTRICS WEBSITE FEEDBACK SNIPPET-->\r\n<script type='text/javascript'>\r\n(function(){var g=function(e,h,f,g){\r\nthis.get=function(a){for(var a=a+\"=\",c=document.cookie.split(\";\"),b=0,e=c.length;b<e;b++){for(var d=c[b];\" \"==d.charAt(0);)d=d.substring(1,d.length);if(0==d.indexOf(a))return d.substring(a.length,d.length)}return null};\r\nthis.set=function(a,c){var b=\"\",b=new Date;b.setTime(b.getTime()+6048E5);b=\"; expires=\"+b.toGMTString();document.cookie=a+\"=\"+c+b+\"; path=/; \"};\r\nthis.check=function(){var a=this.get(f);if(a)a=a.split(\":\");else if(100!=e)\"v\"==h&&(e=Math.random()>=e/100?0:100),a=[h,e,0],this.set(f,a.join(\":\"));else return!0;var c=a[1];if(100==c)return!0;switch(a[0]){case \"v\":return!1;case \"r\":return c=a[2]%Math.floor(100/c),a[2]++,this.set(f,a.join(\":\")),!c}return!0};\r\nthis.go=function(){if(this.check()){var a=document.createElement(\"script\");a.type=\"text/javascript\";a.src=g;document.body&&document.body.appendChild(a)}};\r\nthis.start=function(){var t=this;\"complete\"!==document.readyState?window.addEventListener?window.addEventListener(\"load\",function(){t.go()},!1):window.attachEvent&&window.attachEvent(\"onload\",function(){t.go()}):t.go()};};\r\ntry{(new g(100,\"r\",\"QSI_S_ZN_5o5yqpvMVjgDOuN\",\"https://zn5o5yqpvmvjgdoun-wiley.siteintercept.qualtrics.com/SIE/?Q_ZID=ZN_5o5yqpvMVjgDOuN\")).start()}catch(i){}})();\r\n</script><div id='ZN_5o5yqpvMVjgDOuN'><!--DO NOT REMOVE-CONTENTS PLACED HERE--></div>\r\n<!--END WEBSITE FEEDBACK SNIPPET-->","enabled":false},{"pages":["all"],"location":"header","script":"<!-- Hotjar Tracking Code for http://www.dummies.com -->\r\n<script>\r\n (function(h,o,t,j,a,r){\r\n h.hj=h.hj||function(){(h.hj.q=h.hj.q||[]).push(arguments)};\r\n h._hjSettings={hjid:257151,hjsv:6};\r\n a=o.getElementsByTagName('head')[0];\r\n r=o.createElement('script');r.async=1;\r\n r.src=t+h._hjSettings.hjid+j+h._hjSettings.hjsv;\r\n a.appendChild(r);\r\n })(window,document,'https://static.hotjar.com/c/hotjar-','.js?sv=');\r\n</script>","enabled":false},{"pages":["article"],"location":"header","script":"<!-- //Connect Container: dummies --> <script src=\"//get.s-onetag.com/bffe21a1-6bb8-4928-9449-7beadb468dae/tag.min.js\" async defer></script>","enabled":true},{"pages":["homepage"],"location":"header","script":"<meta name=\"facebook-domain-verification\" content=\"irk8y0irxf718trg3uwwuexg6xpva0\" />","enabled":true},{"pages":["homepage","article","category","search"],"location":"footer","script":"<!-- Facebook Pixel Code -->\r\n<noscript>\r\n<img height=\"1\" width=\"1\" src=\"https://www.facebook.com/tr?id=256338321977984&ev=PageView&noscript=1\"/>\r\n</noscript>\r\n<!-- End Facebook Pixel Code 
-->","enabled":true}]}},"pageScriptsLoadedStatus":"success"},"navigationState":{"navigationCollections":[{"collectionId":287568,"title":"BYOB (Be Your Own Boss)","hasSubCategories":false,"url":"/collection/for-the-entry-level-entrepreneur-287568"},{"collectionId":293237,"title":"Be a Rad Dad","hasSubCategories":false,"url":"/collection/be-the-best-dad-293237"},{"collectionId":295890,"title":"Career Shifting","hasSubCategories":false,"url":"/collection/career-shifting-295890"},{"collectionId":294090,"title":"Contemplating the Cosmos","hasSubCategories":false,"url":"/collection/theres-something-about-space-294090"},{"collectionId":287563,"title":"For Those Seeking Peace of Mind","hasSubCategories":false,"url":"/collection/for-those-seeking-peace-of-mind-287563"},{"collectionId":287570,"title":"For the Aspiring Aficionado","hasSubCategories":false,"url":"/collection/for-the-bougielicious-287570"},{"collectionId":291903,"title":"For the Budding Cannabis Enthusiast","hasSubCategories":false,"url":"/collection/for-the-budding-cannabis-enthusiast-291903"},{"collectionId":299891,"title":"For the College Bound","hasSubCategories":false,"url":"/collection/for-the-college-bound-299891"},{"collectionId":291934,"title":"For the Exam-Season Crammer","hasSubCategories":false,"url":"/collection/for-the-exam-season-crammer-291934"},{"collectionId":301547,"title":"For the Game Day Prepper","hasSubCategories":false,"url":"/collection/big-game-day-prep-made-easy-301547"}],"navigationCollectionsLoadedStatus":"success","navigationCategories":{"books":{"0":{"data":[{"categoryId":33512,"title":"Technology","hasSubCategories":true,"url":"/category/books/technology-33512"},{"categoryId":33662,"title":"Academics & The Arts","hasSubCategories":true,"url":"/category/books/academics-the-arts-33662"},{"categoryId":33809,"title":"Home, Auto, & Hobbies","hasSubCategories":true,"url":"/category/books/home-auto-hobbies-33809"},{"categoryId":34038,"title":"Body, Mind, & Spirit","hasSubCategories":true,"url":"/category/books/body-mind-spirit-34038"},{"categoryId":34224,"title":"Business, Careers, & Money","hasSubCategories":true,"url":"/category/books/business-careers-money-34224"}],"breadcrumbs":[],"categoryTitle":"Level 0 Category","mainCategoryUrl":"/category/books/level-0-category-0"}},"articles":{"0":{"data":[{"categoryId":33512,"title":"Technology","hasSubCategories":true,"url":"/category/articles/technology-33512"},{"categoryId":33662,"title":"Academics & The Arts","hasSubCategories":true,"url":"/category/articles/academics-the-arts-33662"},{"categoryId":33809,"title":"Home, Auto, & Hobbies","hasSubCategories":true,"url":"/category/articles/home-auto-hobbies-33809"},{"categoryId":34038,"title":"Body, Mind, & Spirit","hasSubCategories":true,"url":"/category/articles/body-mind-spirit-34038"},{"categoryId":34224,"title":"Business, Careers, & Money","hasSubCategories":true,"url":"/category/articles/business-careers-money-34224"}],"breadcrumbs":[],"categoryTitle":"Level 0 
Category","mainCategoryUrl":"/category/articles/level-0-category-0"}}},"navigationCategoriesLoadedStatus":"success"},"searchState":{"searchList":[],"searchStatus":"initial","relatedArticlesList":[],"relatedArticlesStatus":"initial"},"routeState":{"name":"ArticleCategory","path":"/category/articles/data-science-33577/","hash":"","query":{},"params":{"category":"data-science-33577"},"fullPath":"/category/articles/data-science-33577/","meta":{"routeType":"category","breadcrumbInfo":{"suffix":"Articles","baseRoute":"/category/articles"},"prerenderWithAsyncData":true},"from":{"name":null,"path":"/","hash":"","query":{},"params":{},"fullPath":"/","meta":{}}},"profileState":{"auth":{},"userOptions":{},"status":"success"}}
Logo
  • Articles Open Article Categories
  • Books Open Book Categories
  • Collections Open Collections list
  • Custom Solutions

Article Categories

Book Categories

Collections

Explore all collections
BYOB (Be Your Own Boss)
Be a Rad Dad
Career Shifting
Contemplating the Cosmos
For Those Seeking Peace of Mind
For the Aspiring Aficionado
For the Budding Cannabis Enthusiast
For the College Bound
For the Exam-Season Crammer
For the Game Day Prepper
Log In
  • Home
  • Technology Articles
  • Information Technology Articles
  • Data Science Articles

Data Science Articles

Data science is what happens when you let the world's brightest minds loose on a big dataset. It gets crazy. Our articles will walk you through what data science is and what it does.

Browse By Category

Big Data

Databases

General Data Science

Web Analytics


Articles From Data Science


369 results
General Data Science: Linear Regression vs. Logistic Regression

Article / Updated 09-24-2024

Both linear and logistic regression see a lot of use in data science but are commonly used for different kinds of problems. You need to know and understand both types of regression to perform a full range of data science tasks. Of the two, logistic regression is harder to understand in many respects because it necessarily uses a more complex equation model. The following information gives you a basic overview of how linear and logistic regression differ.

The equation model

Any discussion of the difference between linear and logistic regression must start with the underlying equation model. The equation for linear regression is straightforward.

y = a + bx

You may see this equation in other forms and you may see it called ordinary least squares regression, but the essential concept is always the same. Depending on the source you use, some of the equations used to express logistic regression can become downright terrifying unless you’re a math major. However, the start of this discussion can use one of the simplest views of logistic regression:

p = f(a + bx)

Here, p is equal to the logistic function, f, applied to two model parameters, a and b, and one explanatory variable, x. When you look at this particular model, you see that it really isn’t all that different from the linear regression model, except that you now feed the result of the linear regression through the logistic function to obtain the required curve.

The output (dependent variable) is a probability ranging from 0 (not going to happen) to 1 (definitely will happen), or a categorization that says something is either part of the category or not part of the category. (You can also perform multiclass categorization, but focus on the binary response for now.) The best way to view the difference between linear regression output and logistic regression output is to say the following:

Linear regression is continuous. A continuous value can take any value within a specified interval (range) of values. For example, no matter how closely the height of two individuals matches, you can always find someone whose height fits between those two individuals. Examples of continuous values include height, weight, and waist size.

Logistic regression is discrete. A discrete value has specific values that it can assume. For example, a hospital can admit only a specific number of patients in a given day. You can’t admit half a patient (at least, not alive). Examples of discrete values include the number of people at the fair, the number of jellybeans in the jar, and the colors of automobiles produced by a vendor.

The logistic function

Of course, now you need to know about the logistic function. You can find a variety of forms of this function as well, but here’s the easiest one to understand:

f(x) = e^x / (e^x + 1)

You already know about f, which is the logistic function, and x is the expression you want to transform, which is a + bx in this case. That leaves e, the base of the natural logarithm, which has an irrational value of approximately 2.718 for the sake of this discussion.

Another way you see this function expressed is

f(x) = 1 / (1 + e^-x)

Both forms are correct, but the first form is easier to use. Consider a simple problem in which a, the y-intercept, is 0, and b, the slope, is 1. The example uses x values from –6 to 6.
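The original article goes straight into a hand calculation at this point. The short Python sketch below is not part of it; it simply evaluates the logistic function directly so you can check the rounded values quoted in the next paragraph (only the standard library is assumed).

# Evaluate the logistic function f(x) = e**x / (e**x + 1) for a = 0, b = 1,
# so the input to f is just x. Used here only to double-check the rounded
# values discussed in the text.
from math import exp

def logistic(x):
    return exp(x) / (exp(x) + 1)

for x in (-6, 0, 6):
    print(x, round(logistic(x), 4))
# Prints roughly: -6 0.0025, 0 0.5, 6 0.9975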
Consequently, the first f(x) value would look like this when calculated (all values are rounded):

f(-6) = e^-6 / (1 + e^-6) = 0.00248 / (1 + 0.00248) = 0.002474

As you might expect, an x value of 0 would result in an f(x) value of 0.5, and an x value of 6 would result in an f(x) value of 0.9975. Obviously, a linear regression would show different results for precisely the same x values. If you calculate and plot all the results from both logistic and linear regression using the following code, you receive a plot comparing the two curves.

import matplotlib.pyplot as plt
%matplotlib inline    # Jupyter magic; omit this line outside a notebook
from math import exp

x_values = range(-6, 7)
# Linear values, normalized into the 0-1 range for comparison
lin_values = [(0 + 1*x) / 13 for x in range(0, 13)]
# Logistic values: f(x) = e^x / (1 + e^x) with a = 0 and b = 1
log_values = [exp(0 + 1*x) / (1 + exp(0 + 1*x)) for x in x_values]

plt.plot(x_values, lin_values, 'b-^')
plt.plot(x_values, log_values, 'g-*')
plt.legend(['Linear', 'Logistic'])
plt.show()

This example relies on list comprehension to calculate the values because it makes the calculations clearer. The linear regression uses a different numeric range because you must normalize the values to appear in the 0 to 1 range for comparison. This is also why you divide the calculated values by 13. The exp(x) call used for the logistic regression raises e to the power of x (e^x), as needed for the logistic function.

The model discussed here is simplified, and some math majors out there are probably throwing a temper tantrum of the most profound proportions right now. The Python or R package you use will actually take care of the math in the background, so what you really need to know is how the math works at a basic level so that you can understand how to use the packages. This section provides what you need to use the packages. However, if you insist on carrying out the calculations the old way, with chalk on a chalkboard, you'll likely need a lot more information.

The problems that logistic regression solves

You can separate logistic regression into several categories. The first is simple logistic regression, in which you have one dependent variable and one independent variable, much as you see in simple linear regression. However, because of how you calculate the logistic regression, you can expect only two kinds of output:

Classification: Decides between two available outcomes, such as male or female, yes or no, or high or low. The outcome is dependent on which side of the line a particular data point falls.

Probability: Determines the probability that something is true or false. The values true and false can have specific meanings. For example, you might want to know the probability that a particular apple will be yellow or red based on the presence of yellow and red apples in a bin.

Fit the curve

As part of understanding the difference between linear and logistic regression, consider this grade prediction problem, which lends itself well to linear regression. In the following code, you see the effect of trying to use logistic regression with that data:

x1 = range(0, 9)
y1 = (0.25, 0.33, 0.41, 0.53, 0.59, 0.70, 0.78, 0.86, 0.98)

plt.scatter(x1, y1, c='r')                # original data points
lin_values = [0.242 + 0.0933*x for x in x1]
log_values = [exp(0.242 + .9033*x) / (1 + exp(0.242 + .9033*x)) for x in range(-4, 5)]
plt.plot(x1, lin_values, 'b-^')
plt.plot(x1, log_values, 'g-*')
plt.legend(['Linear', 'Logistic', 'Org Data'])
plt.show()

The example has undergone a few changes to make it easier to see precisely what is happening. It relies on the same data, which was converted from questions answered correctly on the exam to a percentage.
If you have 100 questions and you answer 25 of them correctly, you have answered 25 percent (0.25) of them correctly. The values are normalized to fall between 0 and 1.

In the resulting plot, the linear regression follows the data points closely; the logistic regression doesn't. However, logistic regression often is the correct choice when the data points naturally follow the logistic curve, which happens far more often than you might think. You must use the technique that fits your data best, which means using linear regression in this case.

A pass/fail example

An essential point to remember is that logistic regression works best for probability and classification. Consider that points on an exam ultimately predict passing or failing the course. If you get a certain percentage of the answers correct, you pass, but you fail otherwise. The following code considers the same data used for the example above but converts it to a pass/fail list. When a student gets at least 70 percent of the questions correct, success is assured.

# Convert the continuous scores into discrete pass/fail labels
y2 = [0 if x < 0.70 else 1 for x in y1]

plt.scatter(x1, y2, c='r')
lin_values = [0.242 + 0.0933*x for x in x1]
log_values = [exp(0.242 + .9033*x) / (1 + exp(0.242 + .9033*x)) for x in range(-4, 5)]
plt.plot(x1, lin_values, 'b-^')
plt.plot(x1, log_values, 'g-*')
plt.legend(['Linear', 'Logistic', 'Org Data'])
plt.show()

This is an example of how you can use list comprehensions in Python to obtain a required dataset or data transformation. The list comprehension for y2 starts with the continuous data in y1 and turns it into discrete data. Note that the example uses precisely the same equations as before; all that has changed is the manner in which you view the data. Because of the change in the data, linear regression is no longer the option to choose. Instead, you use logistic regression to fit the data. Take into account that this example really hasn't done any sort of analysis to optimize the results; the logistic regression fits the data even better if you do.
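The hand-picked coefficients above only illustrate the shape of the curve; in practice you let a package estimate them, as noted earlier. The following is a minimal sketch of that idea, assuming scikit-learn is available (scikit-learn is not part of the original example); it fits a logistic regression to the same pass/fail data and reports the fitted coefficients and probabilities.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Exam scores (as the single feature) and the pass/fail labels from the example above
x1 = np.arange(9).reshape(-1, 1)
y1 = [0.25, 0.33, 0.41, 0.53, 0.59, 0.70, 0.78, 0.86, 0.98]
y2 = [0 if score < 0.70 else 1 for score in y1]

model = LogisticRegression()
model.fit(x1, y2)

print(model.intercept_, model.coef_)     # fitted a and b
print(model.predict_proba(x1)[:, 1])     # fitted probability of passing for each student

The fitted intercept and slope play the same roles as a and b in the hand-built curve; the package simply chooses values that best match the observed pass/fail outcomes.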

General Data Science | Data Analytics & Visualization All-in-One Cheat Sheet

Cheat Sheet / Updated 04-12-2024

A wide range of tools is available to help businesses big and small take advantage of the data science revolution. Among the most essential of these tools are Microsoft Power BI, Tableau, SQL, and the R and Python programming languages.

Big Data | Beyond Boundaries: Unstructured Data Orchestration

Article / Updated 12-01-2023

Getting the most out of your unstructured data is an essential task for any organization these days, especially when considering the disparate storage systems, applications, and user locations. So it's not an accident that data orchestration is the term that brings everything together. Bringing all your data together shares similarities with conducting an orchestra. Instead of combining the violin, oboe, and cello, this brand of orchestration combines distributed data from different places, platforms, and locations into a cohesive entity presented to applications or users anywhere.

Historically, accessing data at high performance outside your own computer network was inefficient. Because the storage infrastructure existed in a silo, systems like HPC Parallel (which lets users store and access shared data across multiple networked storage nodes), Enterprise NAS (which allows large-scale storage and access to other networks), and Global Namespace (which virtually simplifies network file systems) were limited when it came to sharing. Because each operated independently, the data within each system was siloed, making it difficult to collaborate on data sets across multiple locations. Collaboration was possible, but too often you lost the ability to have high performance. This either/or trade-off decreased potential: an IT architecture that supported both high performance and collaboration with data sets from different storage silos was rarely achievable, so you were forced to choose one but never both.

What is data orchestration?

Data orchestration is the automated process of taking siloed data from multiple data storage systems and locations and combining and organizing it into a single namespace. Then a high-performance file system can place data in the edge service, data center, or cloud service most optimal for the workload.

The recent rise of data analytic applications and artificial intelligence (AI) capabilities has accelerated the use of data across different locations and even different organizations. In the next data cycle, organizations will need both high performance and agility with their data to compete and thrive in a competitive environment. That means data no longer has a 1:1 relationship with the applications and compute environment that generated it. It needs to be used, analyzed, and repurposed with different AI models and alternate workloads, and across a remote, collaborative environment.

Hammerspace's technology makes data available to different foundational models, remote applications, decentralized compute clusters, and remote workers to automate and streamline data-driven development programs, data insights, and business decision making. This capability enables a unified, fast, and efficient global data environment for the entire workflow, from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds. Control of enterprise data services for governance, security, data protection, and compliance can now be implemented globally at a file-granular level across all storage types and locations. Applications and AI models can access data stored in remote locations while using automated orchestration tools to provide high-performance local access when needed for processing. Organizations can grow their talent pools with access to team members no matter where they reside.

Decentralizing the data center

Data collection has become more prominent, and the traditional system of centralized data management has limitations. Centralized data storage can limit the amount of data available to applications. Then there are the high infrastructure costs when multiple applications are needed to manage and move data, multiple copies of data are retained in different storage systems, and more headcount is needed to manage the complex, disconnected infrastructure environment. Such setbacks suggest that the data center is no longer the center of data, and storage system constraints should no longer define data architectures.

Hammerspace specializes in decentralized environments, where data may need to span two or more sites and possibly one or more cloud providers and regions, and/or where a remote workforce needs to collaborate in real time. It enables a global data environment by providing a unified, parallel global file system.

Enabling a global data environment

Hammerspace completely revolutionizes previously held notions of how unstructured data architectures should be designed, delivering the performance needed across distributed environments to:

Free workloads from data silos.

Eliminate copy proliferation.

Provide direct data access through local metadata to applications and users, no matter where the data is stored.

This technology allows organizations to take full advantage of the performance capabilities of any server, storage system, and network anywhere in the world. This capability enables a unified, fast, and efficient global data environment for the entire workflow, from data creation to processing, collaboration, and archiving across edge devices, data centers, and public and private clouds. The days of enterprises struggling with a siloed, distributed, and inefficient data environment are over. It's time to start expecting more from data architectures with automated data orchestration. Find out how by downloading Unstructured Data Orchestration For Dummies, Hammerspace Special Edition.

General Data Science | E-Commerce and Data Testing Tactics

Article / Updated 07-27-2023

In growth, you use testing methods to optimize your web design and messaging so that it performs at its absolute best with the audiences to which it's targeted. Although testing and web analytics methods are both intended to optimize performance, testing goes one layer deeper than web analytics. You use web analytics to get a general idea about the interests of your channel audiences and how well your marketing efforts are paying off over time. After you have this information, you can then go in deeper to test variations on live visitors in order to gain empirical evidence about what designs and messaging your visitors actually prefer. Testing tactics can help you optimize your website design or brand messaging for increased conversions in all layers of the funnel. Testing is also useful when optimizing your landing pages for user activations and revenue conversions.

Checking out common types of testing in growth

When you use data insights to increase growth for e-commerce businesses, you're likely to run into the three following testing tactics: A/B split testing, multivariate testing, and mouse-click heat map analytics.

An A/B split test is an optimization tactic you can use to split variations of your website or brand messaging between sets of live audiences in order to gauge responses and decide which of the two variations performs best. A/B split testing is the simplest testing method you can use for website or messaging optimization.

Multivariate testing is, in many ways, similar to the multivariate regression analysis that I discuss in Chapter 5. Like multivariate regression analysis, multivariate testing allows you to uncover relationships, correlations, and causations between variables and outcomes. In the case of multivariate testing, you're testing several conversion factors simultaneously over an extended period in order to uncover which factors are responsible for increased conversions. Multivariate testing is more complicated than A/B split testing, but it usually provides quicker and more powerful results.

Lastly, you can use mouse-click heat map analytics to see how visitors are responding to your design and messaging choices. In this type of testing, you use the mouse-click heat map to help you make optimal website design and messaging choices to ensure that you're doing everything you can to keep your visitors focused and converting.

Landing pages are meant to offer visitors little to no options, except to convert or to exit the page. Because a visitor has so few options on what he can do on a landing page, you don't really need to use multivariate testing or website mouse-click heat maps. Simple A/B split tests suffice.

Data scientists working in growth hacking should be familiar with (and know how to derive insight from) the following testing applications:

Webtrends: Offers a conversion-optimization feature that includes functionality for A/B split testing and multivariate testing.

Optimizely: A popular product among the growth-hacking community. You can use Optimizely for multipage funnel testing, A/B split testing, and multivariate testing, among other things.

Visual Website Optimizer: An excellent tool for A/B split testing and multivariate testing.

Testing for acquisitions

Acquisitions testing provides feedback on how well your content performs with prospective users in your assorted channels. You can use acquisitions testing to help compare your message's performance in each channel, helping you optimize your messaging on a per-channel basis. If you want to optimize the performance of your brand's published images, you can use acquisition testing to compare image performance across your channels as well. Lastly, if you want to increase your acquisitions through increases in user referrals, use testing to help optimize your referrals messaging for the referrals channels. Acquisition testing can help you begin to understand the specific preferences of prospective users on a channel-by-channel basis. You can use A/B split testing to improve your acquisitions in the following ways:

Social messaging optimization: After you use social analytics to deduce the general interests and preferences of users in each of your social channels, you can then further optimize your brand messaging along those channels by using A/B split testing to compare your headlines and social media messaging within each channel.

Brand image and messaging optimization: Compare and optimize the respective performances of images along each of your social channels.

Optimized referral messaging: Test the effectiveness of your email messaging at converting new user referrals.

Testing for activations

Activation testing provides feedback on how well your website and its content perform in converting acquired users to active users. The results of activation testing can help you optimize your website and landing pages for maximum sign-ups and subscriptions. Here's how you'd use testing methods to optimize user activation growth:

Website conversion optimization: Make sure your website is optimized for user activation conversions. You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help you optimize your website design.

Landing pages: If your landing page has a simple call to action that prompts guests to subscribe to your email list, you can use A/B split testing for simple design optimization of this page and the call-to-action messaging.

Testing for retentions

Retentions testing provides feedback on how well your blog post and email headlines are performing among your base of activated users. If you want to optimize your headlines so that active users want to continue active engagements with your brand, test the performance of your user-retention tactics. Here's how you can use testing methods to optimize user retention growth:

Headline optimization: Use A/B split testing to optimize the headlines of your blog posts and email marketing messages. Test different headline varieties within your different channels, and then use the varieties that perform the best. Email open rates and RSS view rates are ideal metrics to track the performance of each headline variation.

Conversion rate optimization: Use A/B split testing on the messaging within your emails to decide which messaging variety more effectively gets your activated users to engage with your brand. The more effective your email messaging is at getting activated users to take a desired action, the greater your user retention rates.

Testing for revenue growth

Revenue testing gauges the performance of revenue-generating landing pages, e-commerce pages, and brand messaging. Revenue testing methods can help you optimize your landing and e-commerce pages for sales conversions. Here's how you can use testing methods to optimize revenue growth:

Website conversion optimization: You can use A/B split testing, multivariate testing, or a mouse-click heat map data visualization to help optimize your sales page and shopping cart design for revenue-generating conversions.

Landing page optimization: If you have a landing page with a simple call to action that prompts guests to make a purchase, you can use A/B split testing for design optimization.
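The A/B split tests described above ultimately come down to comparing two conversion rates. The following is a minimal, hand-rolled sketch of that comparison using a two-proportion z-test; the visitor and conversion counts are invented for illustration, and in practice the testing tools listed earlier report this kind of significance calculation for you.

from math import sqrt
from statistics import NormalDist

# Hypothetical results for two page variations
visitors_a, conversions_a = 5000, 260
visitors_b, conversions_b = 5000, 310

p_a = conversions_a / visitors_a
p_b = conversions_b / visitors_b
p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)

# Standard error of the difference under the pooled conversion rate
se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
z = (p_b - p_a) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided test

print(f"A: {p_a:.3%}  B: {p_b:.3%}  z = {z:.2f}  p = {p_value:.4f}")

A small p-value suggests the difference between the two variations is unlikely to be random noise, which is the evidence you need before declaring a winner.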

General Data Science | Blockchain Data Analytics For Dummies Cheat Sheet

Cheat Sheet / Updated 07-24-2023

Blockchain technology is much more than just another way to store data. It's a radical new method of storing validated data and transaction information in an indelible, trusted repository. Blockchain has the potential to disrupt business as we know it, and in the process, provide a rich new source of behavioral data. Data analysts have long found valuable insights from historical data, and blockchain can expose new and reliable data to drive business strategy. To best leverage the value that blockchain data offers, become familiar with blockchain technology and how it stores data, and learn how to extract and analyze this data.

General Data Science | The Primary Types of Blockchain

Article / Updated 07-24-2023

In 2008, Bitcoin was the only blockchain implementation. At that time, Bitcoin and blockchain were synonymous. Now hundreds of different blockchain implementations exist. Each new blockchain implementation emerges to address a particular need, and each one is unique. However, blockchains tend to share many features with other blockchains. Before examining blockchain applications and data, it helps to look at their similarities. Check out this article to learn how blockchains work.

Categorizing blockchain implementations

One of the most common ways to evaluate blockchains is to consider the underlying data visibility, that is, who can see and access the blockchain data. And just as important, who can participate in the decision (consensus) to add new blocks to the blockchain? The three primary blockchain models are public, private, and hybrid.

Opening blockchain to everyone

Nakamoto's original blockchain proposal described a public blockchain. After all, blockchain technology is all about providing trusted transactions among untrusted participants. Sharing a ledger of transactions among nodes in a public network provides a classic untrusted network. If anyone can join the network, you have no criteria on which to base your trust. It's almost like throwing a $20 bill out your window and trusting that only the person you intend will pick it up.

Public blockchain implementations, including Bitcoin and Ethereum, depend on a consensus algorithm that makes it hard to mine blocks but easy to validate them (a toy sketch of this property appears a little later in this article). Proof of Work (PoW) is the most common consensus algorithm in use today for public blockchains, but that may change. Ethereum is in the process of transitioning to the Proof of Stake (PoS) consensus algorithm, which requires less computation and depends on how much blockchain currency a node holds. The idea is that a node with more blockchain currency would be affected negatively if it participates in unethical behavior. The higher the stake you have in something, the greater the chance that you'll care about its integrity.

Because public blockchains are open to anyone (anyone can become a node on the network), no permission is needed to join. For this reason, a public blockchain is also called a permissionless blockchain. Public (permissionless) blockchains are most often used for new apps that interact with the public in general. A public blockchain is like a retail store, in that anyone can walk into the store and shop.

Limiting blockchain access

The opposite of a public blockchain is a private blockchain, such as Hyperledger Fabric. In a private blockchain, also called a permissioned blockchain, the entity that owns and controls the blockchain grants and revokes access to the blockchain data. Because most enterprises manage sensitive or private data, private blockchains are commonly used because they can limit access to that data. The blockchain data is still transparent and readily available but is subject to the owning entity's access requirements.

Some have argued that private blockchains violate data transparency, the original intent of blockchain technology. Although private blockchains can limit data access (and go against the philosophy of the original blockchain in Bitcoin), limited transparency also allows enterprises to consider blockchain technology for new apps in a private environment. Without the private blockchain option, the technology likely would never be considered for most enterprise applications.
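The "hard to mine, easy to validate" property mentioned above for public blockchains can be illustrated with a toy proof-of-work sketch. This is not how Bitcoin or Ethereum actually implement mining; the difficulty value and block data are invented, and the sketch only shows why finding a valid block is expensive while checking one is cheap.

import hashlib

def mine(block_data, difficulty=4):
    """Search for a nonce whose SHA-256 hash starts with `difficulty` zeros (slow)."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def validate(block_data, nonce, difficulty=4):
    """Checking a claimed nonce takes a single hash (fast)."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

nonce, digest = mine("example transactions")
print(nonce, digest, validate("example transactions", nonce))

Raising the difficulty makes mining exponentially slower, while validation stays a single hash computation, which is exactly the asymmetry that lets untrusted nodes cheaply verify each other's work.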
Combining the best of both worlds

A classic blockchain use case is a supply chain app, which manages a product from its production all the way through its consumption. The beginning of the supply chain is when a product is manufactured, harvested, caught, or otherwise provisioned to send to an eventual customer. The supply chain app then tracks and manages each transfer of ownership as the product makes its way to the physical location where the consumer purchases it. Supply chain apps manage product movement, process payment at each stage in the movement lifecycle, and create an audit trail that can be used to investigate the actions of each owner along the supply chain. Blockchain technology is well suited to support the transfer of ownership and maintain an indelible record of each step in the process.

Many supply chains are complex and consist of multiple organizations. In such cases, data suffers as it is exported from one participant, transmitted to the next participant, and then imported into their data system. A single blockchain would simplify the export/transport/import cycle and auditing. An additional benefit of blockchain technology in supply chain apps is the ease with which a product's provenance (a trace of owners back to its origin) is readily available.

Many of today's supply chains are made up of several enterprises that enter into agreements to work together for mutual benefit. Although the participants in a supply chain are business partners, they do not fully trust one another. A blockchain can provide the level of transactional and data trust that the enterprises need. The best solution is a semi-private blockchain: the blockchain is public for supply chain participants but not for anyone else. This type of blockchain (one that is owned by a group of entities) is called a hybrid, or consortium, blockchain. The participants jointly own the blockchain and agree on policies to govern access.

Describing basic blockchain type features

Each type of blockchain has specific strengths and weaknesses. Which one to use depends on the goals and target environment. You have to know why you need blockchain and what you expect to get from it before you can make an informed decision as to what type of blockchain would be best. The best solution for one organization may not be the best solution for another. The table below shows how blockchain types compare and why you might choose one over the other.

Differences in Types of Blockchain

Feature     | Public                  | Private                                        | Hybrid
Permission  | Permissionless          | Permissioned (limited to organization members) | Permissioned (limited to consortium members)
Consensus   | PoW, PoS, and so on     | Authorized participants                        | Varies; can use any method
Performance | Slow (due to consensus) | Fast (relatively)                              | Generally fast
Identity    | Virtually anonymous     | Validated identity                             | Validated identity

The primary differences between each type of blockchain are the consensus algorithm used and whether participants are known or anonymous. These two concepts are related. An unknown (and therefore completely untrusted) participant will require an environment with a more rigorous consensus algorithm. On the other hand, if you know the transaction participants, you can use a less rigorous consensus algorithm.

Contrasting popular enterprise blockchain implementations

Dozens of blockchain implementations are available today, and soon there will be hundreds. Each new blockchain implementation targets a specific market and offers unique features.
There isn't room in this article to cover even a fair number of blockchain implementations, but you should be aware of some of the most popular. Remember that you'll be learning about blockchain analytics in this book. Although organizations of all sizes are starting to leverage the power of analytics, enterprises were early adopters and have the most mature approach to extracting value from data. The What Matrix website provides a comprehensive comparison of top enterprise blockchains. Visit whatmatrix.com for up-to-date blockchain information. Following are the top enterprise blockchain implementations and some of their strengths and weaknesses (ranking is based on the What Matrix website):

Hyperledger Fabric: The flagship blockchain implementation from the Linux Foundation. Hyperledger is an open-source project backed by a diverse consortium of large corporations. Hyperledger's modular-based architecture and rich support make it the highest rated enterprise blockchain.

VeChain: Currently more popular than Hyperledger, having the highest number of enterprise use cases among products reviewed by What Matrix. VeChain includes support for two native cryptocurrencies and states that its focus is on efficient enterprise collaboration.

Ripple Transaction Protocol: A blockchain that focuses on financial markets. Instead of appealing to general use cases, Ripple caters to organizations that want to implement financial transaction blockchain apps. Ripple was the first commercially available blockchain focused on financial solutions.

Ethereum: The most popular general-purpose, public blockchain implementation. Although Ethereum is not technically an enterprise solution, it's in use in multiple proof of concept projects.

The preceding list is just a brief overview of a small sample of blockchain implementations. If you're just beginning to learn about blockchain technology in general, start out with Ethereum, which is one of the easier blockchain implementations to learn. After that, you can progress to another blockchain that may be better aligned with your organization. Want to learn more? Check out our Blockchain Data Analytics Cheat Sheet.

General Data Science | Looking at the Basics of Statistics, Machine Learning, and Mathematical Methods in Data Science

Article / Updated 06-09-2023

If statistics has been described as the science of deriving insights from data, then what's the difference between a statistician and a data scientist? Good question! While many tasks in data science require a fair bit of statistical know-how, the scope and breadth of a data scientist's knowledge and skill base are distinct from those of a statistician. The core distinctions are outlined below.

Subject matter expertise: One of the core features of data scientists is that they offer a sophisticated degree of expertise in the area to which they apply their analytical methods. Data scientists need this so that they're able to truly understand the implications and applications of the data insights they generate. A data scientist should have enough subject matter expertise to be able to identify the significance of their findings and independently decide how to proceed in the analysis. In contrast, statisticians usually have an incredibly deep knowledge of statistics, but very little expertise in the subject matters to which they apply statistical methods. Most of the time, statisticians are required to consult with external subject matter experts to truly get a firm grasp on the significance of their findings, and to be able to decide the best way to move forward in an analysis.

Mathematical and machine learning approaches: Statisticians rely mostly on statistical methods and processes when deriving insights from data. In contrast, data scientists are required to pull from a wide variety of techniques to derive data insights. These include statistical methods, but also approaches that are not based in statistics, such as those found in mathematics, clustering, classification, and non-statistical machine learning.

Seeing the importance of statistical know-how

You don't need to go out and get a degree in statistics to practice data science, but you should at least get familiar with some of the more fundamental methods that are used in statistical data analysis. These include:

Linear regression: Linear regression is useful for modeling the relationships between a dependent variable and one or several independent variables. The purpose of linear regression is to discover (and quantify the strength of) important correlations between dependent and independent variables.

Time-series analysis: Time series analysis involves analyzing a collection of data on attribute values over time, in order to predict future instances of the measure based on the past observational data.

Monte Carlo simulations: The Monte Carlo method is a simulation technique you can use to test hypotheses, to generate parameter estimates, to predict scenario outcomes, and to validate models. The method is powerful because it can be used to very quickly simulate anywhere from 1 to 10,000 (or more) samples for any process you are trying to evaluate (see the sketch after this list).

Statistics for spatial data: One fundamental and important property of spatial data is that it's not random; it's spatially dependent and autocorrelated. When modeling spatial data, avoid statistical methods that assume your data is random. Kriging and krige are two statistical methods that you can use to model spatial data. These methods enable you to produce predictive surfaces for entire study areas based on sets of known points in geographic space.
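To make the Monte Carlo item in the preceding list concrete, here is a minimal simulation sketch. The dice scenario is invented purely to show the sample-and-count pattern; a real project replaces the roll with whatever process it is trying to evaluate.

import random

def simulate(trials=10_000):
    """Estimate the probability that the sum of two dice exceeds 9."""
    hits = 0
    for _ in range(trials):
        roll = random.randint(1, 6) + random.randint(1, 6)
        if roll > 9:
            hits += 1
    return hits / trials

print(simulate())   # the analytic answer is 6/36, roughly 0.1667

Increasing the number of trials tightens the estimate, which is the trade-off you tune in any Monte Carlo study.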
Working with clustering, classification, and machine learning methods

Machine learning is the application of computational algorithms to learn from (or deduce patterns in) raw datasets. Clustering is a particular type of machine learning (unsupervised machine learning, to be precise), meaning that the algorithms must learn from unlabeled data, and as such, they must use inferential methods to discover correlations. Classification, on the other hand, is called supervised machine learning, meaning that the algorithms learn from labeled data. The following descriptions introduce some of the more basic clustering and classification approaches:

k-means clustering: You generally deploy k-means algorithms to subdivide the data points of a dataset into clusters based on nearest mean values. To determine the optimal division of your data points into clusters, such that the distance between points in each cluster is minimized, you can use k-means clustering (a short sketch appears at the end of this article).

Nearest neighbor algorithms: The purpose of a nearest neighbor analysis is to search for and locate either a nearest point in space or a nearest numerical value, depending on the attribute you use for the basis of comparison.

Kernel density estimation: An alternative way to identify clusters in your data is to use a density smoothing function. Kernel density estimation (KDE) works by placing a kernel (a weighting function that is useful for quantifying density) on each data point in the data set, and then summing the kernels to generate a kernel density estimate for the overall region.

Keeping mathematical methods in the mix

A lot gets said about the value of statistics in the practice of data science, but applied mathematical methods are seldom mentioned. To be frank, mathematics is the basis of all quantitative analyses, and its importance should not be understated. The two following mathematical methods are particularly useful in data science.

Multi-criteria decision making (MCDM): MCDM is a mathematical decision modeling approach that you can use when you have several criteria or alternatives that you must simultaneously evaluate when making a decision.

Markov chains: A Markov chain is a mathematical method that chains together a series of randomly generated variables representing the present state in order to model how changes in present state variables affect future states.
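As a companion to the k-means description above, here is a minimal clustering sketch. It assumes NumPy and scikit-learn are installed and uses synthetic data; neither the libraries nor the data come from the article itself.

import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D points drawn around three made-up centers
rng = np.random.default_rng(42)
points = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(3, 3), scale=0.5, size=(50, 2)),
    rng.normal(loc=(0, 4), scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
print(kmeans.cluster_centers_)   # the three estimated cluster means
print(kmeans.labels_[:10])       # cluster assignments for the first few points

The algorithm alternates between assigning each point to its nearest mean and recomputing the means, which is exactly the "clusters based on nearest mean values" idea described above.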

General Data Science | An Intro to Aligning Blockchain Data Analytics with Business Goals

Article / Updated 06-09-2023

Blockchain technology alone cannot provide rich analytics results. For all that blockchain is, it can't magically provide more data than other technologies. Before selecting blockchain technology for any new development or analytics project, clearly justify why such a decision makes sense. If you already depend on blockchain technology to store data, the decision to use that data for analysis is a lot easier to justify. Here, you examine some reasons why blockchain-supported analytics may allow you to leverage your data in interesting ways.

Leveraging newly accessible decentralized tools to analyze blockchain data

You'll want to learn how to manually access and analyze blockchain data. But although it's important to understand how to exercise granular control over your data throughout the analytics process, higher-level tools make the task easier. The growing number of decentralized data analytics solutions means more opportunities to build analytics models with less effort. Third-party tools may reduce the amount of control you have over the models you deploy, but they can dramatically increase analytics productivity.

The following list of blockchain analytics solutions is not exhaustive and is likely to change rapidly. Take a few minutes to conduct your own internet search for blockchain analytics tools. You'll likely find even more software and services:

Endor: A blockchain-based AI prediction platform that has the goal of making the technology accessible to organizations of all sizes. Endor is both a blockchain analytics protocol and a prediction engine that integrates on-chain and off-chain data for analysis.

Crystal: A blockchain analytics platform that integrates with the Bitcoin and Ethereum blockchains and focuses on cryptocurrency transaction analytics. Different Crystal products cater to small organizations, enterprises, and law enforcement agencies.

OXT: The most focused of the three products listed, OXT is an analytics and visualization explorer tool for the Bitcoin blockchain. Although OXT doesn't provide analytics support for a variety of blockchains, it attempts to provide a wide range of analytics options for Bitcoin.

Monetizing blockchain data

Today's economy is driven by data, and the amount of data being collected about individuals and their behavior is staggering. Think of the last time you accessed your favorite shopping site. Chances are, you saw an ad that you found relevant. Those targeted ads seem to be getting better and better at figuring out what would interest you. The capability to align ads with user preferences depends on an analytics engine acquiring enough data about the user to reliably predict products or services of interest.

Blockchain data can represent the next logical phase of data's value to the enterprise. As more and more consumers realize the value of their personal data, interest is growing in the capability to control that data. Consumers now want to control how their data is being used and demand incentives or compensation for the use of their data. Blockchain technology can provide a central point of presence for personal data and the ability for the data's owner to authorize access to that data. Removing personal data from common central data stores, such as Google and Facebook, has the potential to revolutionize marketing and advertising. Smaller organizations could access valuable marketing information by asking permission from the data owner as opposed to the large data aggregators. Circumventing big players such as Google and Facebook could reduce marketing costs and allow incentives to flow directly to individuals. There is a long way to go to move away from current personal data usage practices, but blockchain technology makes it possible. This process may be accelerated by emerging regulations that protect individual rights to control private data. For example, the European Union's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) both strengthen an individual's ability to control access to, and use of, their personal data.

Exchanging and integrating blockchain data effectively

Much of the value of blockchain data is in its capability to relate to off-chain data. Most blockchain apps refer to some data stored in off-chain repositories. It doesn't make sense to store every type of data in a blockchain. Reference data, which is commonly data that gets updated to reflect changing conditions, may not be a good candidate for storing in a blockchain. Blockchain technology excels at recording value transfers between owners. All applications define and maintain additional information that supports and provides details for transactions but doesn't directly participate in transactions. Information such as product descriptions or customer notes may make more sense to store in an off-chain repository.

Any time blockchain apps rely on on-chain and off-chain data, integration methods become a concern. Even if your app uses only on-chain data, it is likely that analytics models will integrate with off-chain data. For example, owners in blockchain environments are identified by addresses. These addresses have no context external to the blockchain. Any association between an address and a real-world identity is likely stored in an off-chain repository. Another example of the need for off-chain data is when analyzing aircraft safety trends. Perhaps your analysis correlates blockchain-based incident and accident data with weather conditions. Although each blockchain transaction contains a timestamp, you'd have to consult an external weather database to determine prevailing weather conditions at the time of the transaction (a short sketch of this kind of integration appears at the end of this article).

Many examples of the need to integrate off-chain data with on-chain transactions exist. Part of the data acquisition phase of any analytics project is to identify data sources and access methods. In a blockchain analytics project, that process means identifying off-chain data you need to satisfy the goals of your project and how to get that data. Want to learn more? Check out our Blockchain Data Analytics Cheat Sheet.
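To make the on-chain/off-chain integration point concrete, here is a minimal sketch that attaches off-chain weather observations to on-chain transaction timestamps. It assumes the pandas library is available, and every value shown (addresses, incident codes, weather readings) is invented for illustration.

import pandas as pd

# Hypothetical on-chain transactions (only the fields needed for the join)
transactions = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-03-01 10:05", "2021-03-01 14:40"]),
    "address": ["0xabc (hypothetical)", "0xdef (hypothetical)"],
    "incident_code": ["A12", "B07"],
})

# Hypothetical off-chain weather observations
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-03-01 10:00", "2021-03-01 14:00"]),
    "conditions": ["fog", "clear"],
})

# Attach the most recent weather observation at or before each transaction time
combined = pd.merge_asof(transactions.sort_values("timestamp"),
                         weather.sort_values("timestamp"),
                         on="timestamp")
print(combined)

The same pattern applies to any reference data: the blockchain supplies the immutable transaction record, and the off-chain source supplies the context needed for analysis.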

General Data Science | Tableau For Dummies Cheat Sheet

Cheat Sheet / Updated 06-05-2023

Tableau is not a single application but rather a collection of applications that together create a best-in-class business intelligence platform. You may want to dive right in and start creating magnificent visualizations, but there are a few concepts you should know about to refine your data and optimize visualizations. First, determine whether your data set requires data cleansing; if it does, you'll use Tableau Prep. If you want to collaborate and share your data, reports, and visualizations, you'll use either Tableau Cloud or Tableau Server. Central to the Tableau solution suite is Tableau Desktop; virtually all users rely on it at some point to create visualizations from workbooks, dashboards, and stories. Keep reading for tips about data layout and cleansing data in Tableau Prep.

General Data Science | 10 Uses for Blockchain Analytics

Article / Updated 08-04-2022

A common question from management when first considering data analytics, and again in the specific context of blockchain, is "Why do we need this?" Your organization will have to answer that question, in general, and you'll need to explain why building and executing analytics models on your blockchain data will benefit your organization. Without an expected return on investment (ROI), management probably won't authorize and fund any analytics efforts.

The good news is that you aren't the pioneer in blockchain analytics. Other organizations of all sizes have seen the value of formal analysis of blockchain data. Examining what other organizations have done can be encouraging and insightful. You'll probably find some fresh ideas as you familiarize yourself with what others have accomplished with their blockchain analytics projects. Here, you learn about ten ways in which blockchain analytics can be useful to today's (and tomorrow's) organizations. Blockchain analytics focuses on analyzing what happened in the past, explaining what's happening now, and even preparing for what's expected to come in the future. Analytics can help any organization react, understand, prepare, and lower overall risk.

Accessing public financial transaction data

The first blockchain implementation, Bitcoin, is all about cryptocurrency, so it stands to reason that examining financial transactions would be an obvious use of blockchain analytics. If tracking transactions was your first thought of how to use blockchain analytics, you'd be right.

Bitcoin and other blockchain cryptocurrencies used to be viewed as completely anonymous methods of executing financial transactions. The flawed perception of complete anonymity enticed criminals to use the new type of currency to conduct illegal business. Since cryptocurrency accounts aren't directly associated with real-world identities (at least on the blockchain), any users who wanted to conduct secret business warmed up to Bitcoin and other cryptocurrencies. When law enforcement noticed the growth in cryptocurrency transactions, they began looking for ways to re-identify transactions of interest. It turns out that with a little effort and proper legal authority, it isn't that hard to figure out who owns a cryptocurrency account. When a cryptocurrency account is converted and transferred to a traditional account, many criminals are unmasked. Law enforcement became an early adopter of blockchain analytics and still uses models today to help identify suspected criminal and fraudulent activity. Chainalysis is a company that specializes in cryptocurrency investigations; its product, Chainalysis Reactor, allows users to conduct cryptocurrency forensics to connect transactions to real-world identities.

But blockchain technology isn't just for criminals, and blockchain analytics isn't just to catch bad guys. The growing popularity of blockchain and cryptocurrencies could lead to new ways to evaluate entire industries, P2P transactions, currency flow, the wealth of nation-states, and a variety of other market valuations with this new area of analysis. For example, Ethereum has emerged as a major avenue of fundraising for tech startups, and its analysis could lend a deeper look into the industry.

Connecting with the Internet of Things (IoT)

The Internet of Things (IoT) is loosely defined as the collection of devices of all sizes that are connected to the internet and operate at some level with little human interaction.
IoT devices include doorbell cameras, remote temperature sensors, undersea oil leak detectors, refrigerators, and vehicle components. The list is almost endless, as is the number of devices connecting to the internet. Each IoT device has a unique identity and produces and consumes data. All of these devices need some entity that manages data exchange and the device's operation. Although most IoT devices are autonomous (they operate without the need for external guidance), all devices eventually need to request or send data to someone. But that someone doesn't have to be a human.

Currently, the centralized nature of traditional IoT systems reduces their scalability and can create bottlenecks. A central management entity can handle only a limited number of devices. Many companies working in the IoT space are looking to leverage the smart contracts in blockchain networks to allow IoT devices to work more securely and autonomously. These smart contracts are becoming increasingly attractive as the number of IoT devices exceeds 20 billion worldwide in 2020. IoT has matured from a purely centralized network in the past to a distributed network (which still had some central hubs), and the vision for the future removes the need for central managers entirely.

The applications of IoT data are endless, and if the industry does shift in this direction, knowing and understanding blockchain analytics will be necessary to truly unlock its potential. Using blockchain technology to manage IoT devices is only the beginning. Without the application of analytics to really understand the huge volume of data IoT devices will be generating, much of the value of having so many autonomous devices will be lost.

Ensuring data and document authenticity

The Lenovo Group is a multinational technology company that manufactures and distributes consumer electronics. During a business process review, Lenovo identified several areas of inefficiency in their supply chain. After analyzing the issues, they decided to incorporate blockchain technology to increase visibility, consistency, and autonomy, and to decrease waste and process delays. Lenovo published a paper, "Blockchain Technology for Business: A Lenovo Point of View," detailing their efforts and results. In addition to describing their supply chain application of blockchain technology in their paper, Lenovo cited examples of how the New York Times uses blockchain to prove that photos are authentic. They also described how the city of Dubai is working to have all its government documents on blockchain by the end of 2020 in an effort to crack down on corruption and the misuse of funds.

In the era of deep fakes, manipulated photos, and consistently evolving methods of corruption and misappropriation of funds, blockchain can help identify cases of data fraud and misuse. Blockchain's inherent transparency and immutability mean that data cannot be retroactively manipulated to support a narrative. Facts in a blockchain are recorded as unchangeable facts. Analytics models can help researchers understand how data of any type originated, who the original owner was, how it gets amended over time, and whether any amendments are coordinated.

Controlling secure document integrity

As just mentioned, blockchain technology can be used to ensure document authenticity, but it can also be used to ensure document integrity.
In areas where documents should not be able to be altered, such as the legal and healthcare industries, blockchain can help make documents and changes to them transparent and immutable, as well as increase the power the owner of the data has to control and manage it. Documents do not have to be stored in the blockchain to benefit from the technology. Documents can be stored in off-chain repositories, with a hash stored in a block on the blockchain. Each transaction (required to write to a new block) contains the owner's account and a timestamp of the action. The integrity of any document at a specific point in time can be validated simply by comparing the on-chain hash with the calculated hash value of the document. If the hash values match, the document has not changed since the blockchain transaction was created. (A minimal sketch of this hash comparison appears at the end of this article.)

The company DocStamp has implemented a novel use for blockchain document management. Using DocStamp, anyone can self-notarize any document. The document owner maintains control of the document while storing a hash of the document on an Ethereum blockchain. Services such as DocStamp provide the capability to ensure document integrity using blockchain technology. However, assessing document integrity and its use is up to analytics models. The DocStamp model is not generally recognized by courts of law to be as strong as a traditional notary. For that to change, analysts will need to provide model results that show how the approach works and how blockchain can help provide evidence that document integrity is ensured.

Tracking supply chain items

In the Lenovo blockchain paper, the author described how Lenovo replaced printed paperwork in its supply chain with processes managed through smart contracts. The switch to blockchain-based process management greatly decreased the potential for human error and removed many human-related process delays. Replacing human interaction with electronic transactions increased auditability and gave all parties more transparency in the movement of goods. The Lenovo supply chain became more efficient and easier to investigate.

Blockchain-based supply chain solutions are one of the most popular ways to implement blockchain technology. Blockchain technology makes it easy to track items along the supply chain, both forward and backward. The capability to track an item makes it easy to determine where an item is and where that item has been. Tracing an item's provenance, or origin, makes root cause analysis possible. Because the blockchain keeps all history of movement through the supply chain, many types of analysis are easier than with traditional data stores, which can overwrite data. The US Food and Drug Administration is working with several private firms to evaluate using blockchain-based supply chain applications to identify, track, and trace prescription drugs. Analysis of the blockchain data can provide evidence for identifying counterfeit drugs and the delivery paths criminals use to get those drugs to market.

Empowering predictive analytics

You can build several models that allow you to predict future behavior based on past observations. Predictive analytics is often one of the goals of an organization's analytics projects. Large organizations may already have a collection of data that supports prediction. Smaller organizations, however, probably lack enough data to make accurate predictions. Even large organizations would still benefit from datasets that extend beyond their own customers and partners.
In the past, a common approach to acquiring enough data for meaningful analysis was to purchase data from an aggregator. Each data acquisition request costs money, and the data you receive may still be limited in scope. The prospect of using public blockchains has the potential to change the way we all access public data. If a majority of supply chain interactions, for example, use a public blockchain, that data is available to anyone, for free. As more organizations incorporate blockchains into their operations, analysts could leverage the additional data to empower more companies to use predictive analytics with less reliance on localized data.

Analyzing real-time data

Blockchain transactions happen in real time, across intranational and international borders. Not only are banks and innovators in financial technology pursuing blockchain for the speed it offers to transactions, but data scientists and analysts are observing blockchain data changes and additions in real time, greatly increasing the potential for fast decision-making. To view how dynamic blockchain data really is, visit the Ethviewer Ethereum blockchain monitor's website. Each small circle in the blob near the lower-left corner of the web page is a distinct transaction waiting to make it into a new block. You can see how dynamic the Ethereum blockchain is: it changes constantly. And when the blockchain changes, so does the blockchain data that your models use to provide accurate results.

Supercharging business strategy

Companies big and small (marketing firms, financial technology giants, small local retailers, and many more) can fine-tune their strategies to keep up with, and even get ahead of, shifts in the market, the economy, and their customer base. How? By utilizing the results of analytics models built on the organization's blockchain data. The ultimate goal for any analytics project is to provide ROI for the sponsoring organization. Blockchain analytics projects provide a unique opportunity to provide value. New blockchain implementations are only recently becoming common in organizations, and now is the time to view those sources of data as new opportunities to provide value. Analytics can help identify potential sources of ROI.

Managing data sharing

Blockchain technology is often referred to as a disruptive technology, and there is some truth to that characterization. Blockchain does disrupt many things. In the context of data analytics, blockchain changes the way analysts acquire at least some of their data. If a public or consortium blockchain is the source for an analytics model, it's a near certainty that the sponsoring organization does not own all the data. Much of the data in a non-private blockchain comes from other entities that decided to place the data in a shared repository, the blockchain. Blockchain can aid in the storage of data in a distributed network and make that data easily accessible to project teams. Easy access to data makes the whole analytics process easier. There still may be a lot of work to do, but you can always count on the fact that blockchain data is accessible and that it hasn't changed since it was written. Blockchain makes collaboration among data analysts and other data consumers easier than with more traditional data repositories.

Standardizing collaboration forms

Blockchain technology empowers analytics in more ways than just providing access to more data.
Regardless of whether blockchain technology is deployed in the healthcare, legal, government, or other organizational domain, blockchain can lead to more efficient process automation. Also, blockchain’s revolutionary approach to how data is generated and shared among parties can lead to better and greater standardization in how end users populate forms and how other data gets collected. Blockchains can help encourage adherence to agreed-upon standards for data handling. The use of data-handling standards will greatly decrease the amount of time necessary for data cleaning and management. Because cleansing data commonly requires a large time investment in the analytics process, standardization through the use of blockchain can make it easier to build and modify models with a short time-to-market.
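To close, here is a minimal sketch of the document-integrity check described in the "Controlling secure document integrity" section earlier: hash the off-chain document and compare the result with the hash recorded on-chain. The document contents and the way the on-chain value is obtained are hypothetical stand-ins; a real app would read the recorded hash from a blockchain transaction.

import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's bytes."""
    return hashlib.sha256(data).hexdigest()

document = b"Example contract text stored off-chain"   # stand-in for the real file
on_chain_hash = sha256_hex(document)                    # value a blockchain transaction would record

# Later, anyone can re-hash the off-chain copy and compare it with the on-chain value.
if sha256_hex(document) == on_chain_hash:
    print("Document unchanged since the blockchain transaction was created.")
else:
    print("Document was altered after the hash was recorded.")

Because even a one-byte change produces a completely different digest, the comparison gives a simple, verifiable integrity signal without storing the document itself on the chain.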
