Political surveys in particular are carried out regularly, and they are also quite exciting: they are ultimately intended to be a representative reflection of a society's current voting behavior and thus show us which way the wind is blowing in a society and where opinions in it are heading.
Surely we all know that a survey cannot question everyone in a society, for cost reasons alone, but only some people. Those people should be selected in such a way that the sample is one thing above all: a “reduced” image of the population that is as accurate as possible. If this property is fulfilled, we speak of a representative sample.
In general, surveys are about using a sample, followed by a statistical analysis, to say something about the respective population. However, if a sample is not representative of that population, it is distorted, and the statements about the population derived from its analysis cannot be trusted. Reputable survey institutes avoid this mistake, and below we will look at why this is so important.
Political polls
Before political elections, but also at regular intervals, people in a sample are asked the classic “Sunday question”, which asks which party someone would currently vote for. The aim of this type of opinion survey is to find out how the parties currently stand in a population, for example the voting population of a city. The results of such Sunday questions are then usually disseminated in the media and interpreted accordingly.
There are also surveys in market research, and it is not too unlikely that you have already caught yourself buying a not-so-great shampoo or toothpaste because the advertising said that 95% of all consumers surveyed found that the product could perform “miracles”, and that turned out not to be the case. From this we can see exactly what such advertising is aimed at: the sheer height of the 95% approval rating is intended to convince people to buy the product. That is not surprising, since we all know that there is a profit-oriented company in the background. And well, if the toothpaste doesn't taste so fantastic, we'll just buy a different one next time.
Well, okay, but the results of election surveys matter more for a free and democratic society, because they also have an opinion-forming effect and ultimately influence people's voting decisions in real democratic elections! This phenomenon, which has been demonstrated and well researched in numerous scientific studies, is known as the bandwagon effect. It was first investigated by Paul Felix Lazarsfeld in connection with election forecasts in the USA in the 1940s.
In short, the bandwagon effect manifests itself in the fact that a “winner” of a survey can become even stronger after it is published because people like to follow the majority or the winner. This can be explained by the fact that people draw conclusions for themselves from the decisions of others and tend to follow the majority opinion according to the motto “what is good for others must also be good for me”.
It is therefore obvious that this form of “misleading election advertising” is an extremely effective means of dirty campaigning in the political arena, especially when surveys are manipulated in the run-up to elections. That was exactly one of the accusations in the so-called “Beinschab case”. The accusation was that embellished survey results were used specifically to boost the popularity of the then up-and-coming young politician and later Austrian Chancellor Sebastian Kurz. The surveys were widely disseminated in the media, which may have resulted in a bandwagon effect. This procedure became known as the “Beinschab tool”. To this day, the Public Prosecutor's Office for Economic Affairs and Corruption (WKStA) is still dealing with the case.
It is more than clear that deliberately manipulating how the electorate forms its opinions, through strategically placed, embellished popularity ratings of a political party, interferes with the free right to vote. The free right to vote is a constitutionally guaranteed principle that is essential to the functioning of a genuine democracy (see Article 26 of the Federal Constitutional Law of the Republic of Austria and Article 38 of the Basic Law of the Federal Republic of Germany).
Example: Mien and the orange party
Let's go into detail at this point and look at how exactly an embellished survey result can be produced in advance through targeted manipulation of the sample. We will look at a fictional example below. This example compares a representative and a manipulated sample of 1000 eligible voters and shows how such a manipulated sample can be constructed using simple basic arithmetic.
In the fictional town of Mien, there are 3 electoral districts (called A, B and C for convenience), as well as 3 parties (red, orange and gray parties). A total of 1 million people are eligible to vote and (as in many cities around the world) the popularity of the three parties varies across Mien's electoral districts. The true vote shares in each of Mien's electoral districts are known in this example.
So we assume the following numbers:
- Constituency A: 300,000 eligible voters (red party 70%, orange party 10% and gray party 20%)
- Constituency B: 100,000 eligible voters (red party 55%, orange party 25% and gray party 20%)
- Constituency C: 600,000 eligible voters (red party 50%, orange party 12% and gray party 38%)
An opinion survey is now to be carried out with a sample size of 1,000 people, in which participants will be asked which party they would currently vote for.
Before we take a closer look at this example, there are two things to consider:
Note 1:
We see different results in the individual electoral districts and different proportions of the total population eligible to vote (i.e. 3/10 electoral district A, 1/10 electoral district B and 6/10 electoral district C). If you like, you can create a simple formula that can be used to calculate the overall result for the city of Mien. This results in the following overall results for the red, orange and gray parties:
\(\text{True share of red party} = 70\%*\frac{3}{10} + 55\%*\frac{1}{10} + 50\%*\frac{6}{10} = 56.5\%\)
\(\text{True share of orange party} = 10\%*\frac{3}{10} + 25\%*\frac{1}{10} + 12\%*\frac{6}{10} = 12.7\%\)
\(\text{True share of gray party} = 20\%*\frac{3}{10} + 20\%*\frac{1}{10} + 38\%*\frac{6}{10} = 30.8\%\)
This makes the red party the party with the largest vote share in the city with 56.50%, followed by the gray party with 30.80% and lastly the orange party with a total vote share of 12.70%.
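The weighted average from Note 1 can also be checked with a few lines of Python. This is only a minimal sketch: the dictionary layout and the function name `overall_share` are chosen here for illustration, while the numbers are the ones assumed for Mien above.

```python
# District data from the example: each district's share of all eligible
# voters ("weight") and the true party shares within that district.
districts = {
    "A": {"weight": 0.3, "red": 0.70, "orange": 0.10, "gray": 0.20},
    "B": {"weight": 0.1, "red": 0.55, "orange": 0.25, "gray": 0.20},
    "C": {"weight": 0.6, "red": 0.50, "orange": 0.12, "gray": 0.38},
}


def overall_share(party, districts):
    """City-wide share of a party: district shares weighted by district size."""
    return sum(d["weight"] * d[party] for d in districts.values())


true_shares = {p: overall_share(p, districts) for p in ("red", "orange", "gray")}
for party, share in true_shares.items():
    print(f"{party}: {share:.1%}")
# red: 56.5%
# orange: 12.7%
# gray: 30.8%
```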
Note 2:
For a sample of 1000 observations, the vote share of a party in the population can be estimated as follows: \[\begin{align} \text{Estimated share of party }k &= \frac{\text{Number of yes answers for party }k}{1000} \end{align}\] In expectation (or with a large and representative sample), the estimate determined using this formula delivers the true vote share of a party in the population, i.e. if we were to take a large number of such samples, their results would scatter around the true value, and we would only have to expect small and, above all, random deviations from it. A statistician would say in technical jargon that the estimator from formula (1) is unbiased and consistent for the true proportions in the population, given that the sample is representative of the population, i.e. a “reduced image” of Mien.
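Formula (1) is nothing more than counting answers and dividing by the sample size. A minimal sketch in Python, where the function name `estimate_shares` and the tiny list of answers are invented purely for illustration:

```python
from collections import Counter


def estimate_shares(answers):
    """Formula (1): number of answers for each party divided by the sample size."""
    counts = Counter(answers)
    n = len(answers)
    return {party: count / n for party, count in counts.items()}


# Hypothetical mini-sample of four respondents (not data from the article).
example_answers = ["red", "red", "gray", "orange"]
print(estimate_shares(example_answers))
# {'red': 0.5, 'gray': 0.25, 'orange': 0.25}
```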
Case 1: Representative sample
In this first case, we want to collect a sample from the pool of possible people to be interviewed (i.e. from the pool of all eligible voters) that reflects as accurately as possible the population of Mien who is eligible to vote. For such a true random sample, all eligible voters from Mien have the same probability of being selected for the survey. This also means that we can expect that eligible voters from the three electoral districts will be represented in the resulting sample in accordance with their proportions in the population.
In practice, however, it will often be the case that, in a telephone survey for example, not all types of eligible voters are equally likely to be reached, so it must be ensured before the sample is collected that participants are included in the sample in accordance with their proportion of the population. For such a representative random sample we can therefore expect that the estimated shares of all three parties correspond on average to the true shares in the population, i.e. that the estimator from formula (1) provides the correct value for a party's true share, apart from small, non-systematic and therefore random deviations. This leads to a representative and unembellished result. But let’s now look at case 2.
Case 2: A whitewashed result for the orange party
Now a sample is to be collected whose evaluation shows the orange party in a stronger position than it actually is: instead of the true 12.7% in the population, the result should be better by \(x\) percentage points, i.e. \((12.7 + x)\%\). From the last Mien-wide election we know that the orange party did particularly well in district B and rather poorly in districts A and C. For this example, it is assumed that the true vote shares of the orange party are known to one of its internal strategists, which is roughly true in practice, since such information is available or can be collected in advance.
So if we now want to use the background knowledge of the orange party's experienced internal strategist to construct a sample of eligible voters from the three electoral districts in such a way that the orange party does \(x\) percentage points better than it actually does, then we can work it out with simple basic arithmetic:
For \(w_A\), \(w_B\) and \(1-w_A-w_B\), the proportions of the sample collected in the three districts, which add up to 1, the following equation applies to the overall result: \[\begin{align*} (12.7 + x)\% &= w_A*10\% + w_B*25\% + (1-w_A - w_B)*12\% \end{align*}\] If we now fix the share of the sample to be collected in district A, where the orange party is weakest, at \(w_A=0.1\), then we can easily solve the above equation for \(w_B\) and get: \[\begin{align} w_B &= \frac{(12.7 + x)\% - 12\%}{25\% - 12\%} + 0.1*\frac{12\% - 10\%}{25\% - 12\%} \end{align}\] So we now know that if we interview 10% of the 1000 people to be surveyed, i.e. 100, in district A, the proportion to be surveyed in district B follows from formula (2) and the proportion for district C follows from the relationship \(w_C = 1 - w_A - w_B\).
The relationship in formula (2) describes a linear function (i.e. a straight line). Below we have drawn this function. The additional desired percentage points \(x\) are plotted on the x-axis, and the corresponding proportion of the sample that has to be surveyed in district B in order to obtain the desired result of \((12.7 + x)\%\) is plotted on the y-axis. For example, if the survey were meant to show a share of 18% instead of the orange party's true share, i.e. \(x=5.3\) percentage points more than in reality, then we can simply fix this value on the x-axis and read off the corresponding value on the y-axis, which in this case gives a proportion of around \(w_B=0.48\) for the part of the sample to be collected in district B. In whole numbers, this means that of the 1000 people to be interviewed, 480 have to be interviewed in district B, which leaves 420 for district C. This reading example is marked in the graphic by the broken gray lines.
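Formula (2) can also be evaluated directly instead of reading the value off a graphic. The following is a minimal sketch; the function name `district_b_share` and its parameter names are chosen here for illustration, and the default values are the orange party's district shares assumed above.

```python
def district_b_share(x, w_a=0.1, orange_a=10.0, orange_b=25.0,
                     orange_c=12.0, true_share=12.7):
    """Formula (2): sample share for district B that yields (12.7 + x)% for orange."""
    target = true_share + x
    return ((target - orange_c) / (orange_b - orange_c)
            + w_a * (orange_c - orange_a) / (orange_b - orange_c))


w_a = 0.1
w_b = district_b_share(5.3, w_a)   # desired result: 18% instead of 12.7%
w_c = 1 - w_a - w_b
print(round(w_b, 3), round(w_c, 3))
# 0.477 0.423

# Plugging the proportions back into the overall equation gives the target.
check = w_a * 10.0 + w_b * 25.0 + w_c * 12.0
print(round(check, 1))
# 18.0
```

Rounded to whole interviews, this corresponds to roughly 100, 480 and 420 people in districts A, B and C.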
So we see that simple basic mathematical knowledge from middle school is sufficient to specifically manipulate the composition of a sample.
For geeks: random number simulation
For particularly interested readers, we are now carrying out a random number simulation based on the considerations above. If that's too much math for you, you can simply skip this section, because we've already looked at the essential basic considerations in the previous section.
For the simulation, we assume that the orange party would like to have a desired result of around 18% support instead of the true 12.7% according to the survey.
To do this, in the first step we simulate the fictional city of Mien with its 1 million eligible voters, distributed across the 3 electoral districts with different political preferences as assumed above.
In the next step, 10,000 samples of size 1000 are drawn from this population, once as a representative random sample and once as a manipulated sample in which the orange party, in accordance with the above considerations for formula (2), scores approx. 18% instead of its true share of 12.7%. For each of the 10,000 repetitions of this random experiment, the estimated vote shares of each of the three parties are then calculated according to formula (1), and at the end their distribution across all repetitions is displayed graphically.
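The idea of this simulation can be sketched in plain Python. This is not the code behind the article's graphics, only a simplified illustration under stated assumptions: respondents are drawn directly from the district shares (i.e. with replacement) instead of from an explicit list of 1 million voters, the manipulated sample uses fixed quotas of 100, 480 and 420 interviews in districts A, B and C, only 2,000 repetitions are run to keep the runtime short, and all names are chosen here for illustration.

```python
import random
from collections import Counter
from statistics import mean, quantiles

PARTIES = ("red", "orange", "gray")
# True party shares per district (red, orange, gray) and district sizes.
PARTY_SHARES = {"A": (0.70, 0.10, 0.20), "B": (0.55, 0.25, 0.20), "C": (0.50, 0.12, 0.38)}
POPULATION_WEIGHTS = {"A": 0.3, "B": 0.1, "C": 0.6}
# Assumed manipulated design: fixed number of interviews per district.
MANIPULATED_QUOTAS = {"A": 100, "B": 480, "C": 420}


def survey(district_counts, rng):
    """Interview the given number of people per district; apply formula (1)."""
    votes = Counter()
    for district, count in district_counts.items():
        votes.update(rng.choices(PARTIES, weights=PARTY_SHARES[district], k=count))
    n = sum(district_counts.values())
    return {party: votes[party] / n for party in PARTIES}


def representative_counts(n, rng):
    """Every eligible voter is equally likely, so district counts are random."""
    drawn = rng.choices(list(POPULATION_WEIGHTS),
                        weights=list(POPULATION_WEIGHTS.values()), k=n)
    return Counter(drawn)


def simulate(repetitions, n=1000, quotas=None, seed=1):
    """Orange party's estimated share in each repeated survey."""
    rng = random.Random(seed)
    results = []
    for _ in range(repetitions):
        counts = quotas if quotas is not None else representative_counts(n, rng)
        results.append(survey(counts, rng)["orange"])
    return results


fair = simulate(2000)
rigged = simulate(2000, quotas=MANIPULATED_QUOTAS)
for label, shares in (("representative", fair), ("manipulated", rigged)):
    cuts = quantiles(shares, n=100)  # 99 cut points: 1st to 99th percentile
    print(f"{label}: mean {mean(shares):.3f}, "
          f"1% quantile {cuts[0]:.3f}, 99% quantile {cuts[-1]:.3f}")
```

With these quotas the expected orange share is 0.1 * 10% + 0.48 * 25% + 0.42 * 12% = 18.04%, so the mean of the manipulated runs should land close to 18%, while the representative runs should scatter around 12.7%.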
The results of this random number simulation are summarized in the graphic below using so-called box plots, at the top for the representative random sample and at the bottom for the manipulated sample. (A box plot, or box diagram, represents a frequency distribution by five reference points: the median, the first and third quartiles, and the maximum and minimum values.)
The dashed lines in the two graphics represent the average of the shares of the three parties estimated according to formula (1) over the 10,000 draws, and the colored lines in red, orange and gray represent the true values in the simulated population of 1 million eligible voters. If the corresponding colored and dashed lines lie exactly on top of each other, as is the case in the upper graphic for the representative sample, then the estimator from formula (1) delivers the true value on average. In the lower graphic for the manipulated sample, the deviation for the orange party is immediately apparent: the orange party receives an average of around 18.1% of the votes instead of the true 12.7%.
The maximum and minimum values of the simulated results over the 10,000 repetitions of our random experiment, shown by the whiskers (black ends at the top and bottom of the individual box plots), also give us an impression of how large, and how likely, chance-driven deviations from the average over all repetitions can be within a single sample. In fact, the minimum value for the orange party over all 10,000 repetitions is 14.1% and the maximum value is 22.5%, although the result falls below 15.4% in only 1% of all repetitions and exceeds 20.9% in only 1% of all repetitions.
Embellishing surveys is dubious!
Above we looked together at how easy it actually is to specifically manipulate a sample so that the result of a survey based on it goes in a desired direction.
Basically, even basic mathematical knowledge from middle school is enough to derive the relevant considerations. Embellished surveys serve to shape opinions and hope for the bandwagon effect. This affects the free right to vote and such “dirty campaigning” strategies are an attack on our democracy.
Dubious surveys can be recognized by, among other things, the fact that they are published without naming the true client or were even carried out on behalf of a party. It is therefore important not only to know who commissioned a survey, but also to be able to evaluate the institute conducting the survey (if it is not a marketing agency) and the sample.
Before interpreting the results of a Sunday question, we must first pay critical attention to how representative the data are of the corresponding population! The type of survey is also important, as there are differences between telephone and online surveys. It sounds banal, but here too one can intervene manipulatively: a telephone survey based on landline numbers will definitely produce a different (possibly more conservative) result than a survey using online forms.
So anyone who knows in advance how certain people will behave and what results to expect can indeed manipulate the results. That's why, whenever someone presents you with a survey, you should always ask about the composition of the sample, and you shouldn't trust institutes that don't act transparently.
Author: Dr. sc. ETH Sabrina Dorn, statistician
Editor: Andre Wolf

