All posts by druescas

Liquid filtering

In our talk presented here we listed several problems involved in scaling democracy, one of them was the problem of having voters decide from many possibilities, which we referred to with

I can’t decide from too many options

Information overload “the difficulty a person can have understanding an issue and making decisions that can be caused by the presence of too much information.” (Toffler)

Here we will present an idea towards fixing this problem. Briefly, the aim is to combine ideas from collaborative filtering with liquid democracy. Wikipedia on collaborative filtering:

The growth of the Internet has made it much more difficult to effectively extract useful information from all the available online information. The overwhelming amount of data necessitates mechanisms for efficient information filtering. One of the techniques used for dealing with this problem is called collaborative filtering.

This description fits very well with our problem. If you haven’t heard of collaborative filtering before, think amazon/lastfm/netflix recommendations, or digg filtering of stories. Here’s a quote from Anton Kast, lead scientist at Digg

The idea of collaborative filtering is simply to combine the input from many different people to filter information better than would otherwise be possible. In particular, you use information from many independent judgements by many people… On Digg anyone can submit a story. And anyone can vote on any story – that’s the filtering part, and whatever is most popular wins. It’s a giant collaborative filter in the simplest, classic sense

With that out of the way, how do we combine collaborative filtering with liquid democracy? The answer is simple, whereas in classic collaborative filtering user affinities are implicit in user behaviour and are inferred statistically by CF systems, liquid filtering makes user affinities explicit, in the form of delegation. After all, ‘user affinity’ is nothing but ‘delegation’ in different clothes. And as in liquid democracy, transitivity can be mixed in; the interpretation is straightforward: If I trust A’s judgement who trusts B’s judgement, then I trust B’s judgement.

So, let’s get back to the probem we started with. We have a situation where we have to collectively choose among many options. In the presentation this was referred to as scaling the solution space. It is infeasible to apply voting as is, because voters cannot consider all these options to make a judgement. So what we do is to distribute the cognitive load in a way that reflects user delegation. The problem of liquid filtering is the assignment of voters to questions according to delegation choices, in an optimal way.

Where’s the math?

Here’s a simple formalization. Note that things can be made more sophisticated to reach specific criterions of optimality. For example, one objective could be to reduce transitive chains lengths such that user delegates are in fact aligned with the user’s judgement. But in this post we will restrict optimality to mean:

minimize the number of questions that each voter has to answer while ensuring that every voter votes on each question

Some definitions follow. A set of questions to be made

Q = { q1, q2, q3 … }

A set of voters to make a decision

V = { v1, v2, v3 … }

Voter delegation choices define a binary relationship over V

D = { (v, v’) | v,v’ ∈ V }

The transitive closure of D is defined as D+. It encodes transitive delegation information, we will use it shortly. Now, what we want is to produce an assignment of voters to questions. Assignment is a binary relationship over V and Q

A = { (v, q) | v ∈ V, q ∈ Q }

However, assignment as is does not consider delegation. So we define a transitive assigment set T according to the following set comprehension

T = { (v, q) |  ∃ (v’, q) ∈ A ∧ ∃ (v, v’) ∈ D+ ∧ v, v’ ∈ V, q  Q }

T is the meat of liquid filtering. It encodes the fact that although voters may not answer a question directly, they can answer it by virtue of having one of their delegates answer it for them. In this way T is the liquid filtering equivalent of a liquid democracy algorithm that walks the nodes to determine a user’s vote when that user does not vote directly.

We can finally state the aim of liquid filtering in terms of our definitions. Recall:

minimize the number of questions that each voter has to answer while ensuring that every voter votes on each question

which we can decompose into

minimize the number of questions that each voter has to answer

minimize |A|

and

while ensuring that every voter votes on each question

∃ (v, q) ∈ T  ∀ q  Q, v ∈ V

In short

minimize |A| subject to ∃ (v, q) ∈ T  ∀ q  Q, v ∈ V

 How does this address the initial problem? The answer is that by virtue of delegation (T), we have:

|A| ≤ |V| x |Q|

The number of questions necessary to make a decision is less than that if all voters had to decide one every single question (hence this slide). With no delegation, the above equation reduces to equality. The degree to which the reduction of questions occurs is hard to foresee. It depends on how many delegation choices are made, and how those choices are made. Different delegation graphs can lead to different results. Then there’s the matter of implementation.

Where’s the code?

What remains is to write an algorithm that obtains A from the inputs V, Q and D. This algorithm should be optimal in that it minimizes |A|, but other optimality constraints can be added. One is delegation chain length[1]. Another is a threshold such that even if the sum total of voters is minimzed, one does not want to overload specific voters with too many questions. There are many possibilites.

Two other considerations to note.

First, in the above we have assumed that all voters must have a choice for all questions. If we relax this constraint, and use statistics to infer probable choices, we are talking about sampling[2], as in a poll. But even if we are doing sampling, delegation information can still be used to improve the quality of the results, or equivalently reduce the number of questions necessary for a given quality.

Second, it could be possible to write an algorithm that adaptively calculates assignments of voters to questions using results as they become available. This would probably use statistical inference to determine which questions can lead to a final result more quickly,  something along the lines of bayesian experimental design comes to mind.


[1] We could also define a distance matrix over V as specified by D. This would encode the distance between voters in terms of the delegation jumps. The distance matrix is a useful if we are interested in limting delegation chains as described earlier

[2] See appgree

Cryptographically secure voting

Last week marked an important milestone for us here at AgoraVoting. We conducted our second excercise in direct democracy via a spanish congressman. Not only that, this was the first time we carried out an election using our recently completed support for secure voting. All votes were private and encrypted, and the election process and tally was publically verifiable.

This is something we have been working towards for a long time, as described in our design overview published here (english version). This design is not yet finalised, as we still need to add a crucial element, support for secure liquid democracy. But we are well on our way to achieving this.

Many different components work together to achieve the end goal of cryptographically secure voting[1]:

The heart of the voting system, including a web frontend as well as rest API.

Agora’s tallying component, uses openstv

A standalone election verifier, making Agora elections publically verifiable.

An orchestration component built over frestq that coordinates election authorities

A vagrant-puppet project to streamline election authority set up

A rest federated task queue to distribute jobs over https (or http)

A javascript cryptography library we use to encrypt votes, uses SJCL

A collection of tallying implementations for ranked voting

The cryptographic backend, the heart of the secure voting technology.

As you can see, realizing our initial vision is no mean feat, but we feel that we have made an important step in this direction. We’d like to thank everyone who has participated and helped along the way. Keep up the good work!

– The AgoraVoting Team

 


[1] Note that we do not claim here that Agora is literally secure, no system is 100% secure in this sense. What we mean by this term is that Agora employs cryptographic techniques from the field of secure voting as defined in the research literature. See for example

http://lefkimi.ionio.gr/~emagos/overview_voting_2002.pdf

http://www.ee.washington.edu/research/nsl/papers/JCS-05.pdf

for an overview.

Vote inference and liquid recurring elections, example solution

Quick recap of the problem presented in the last entry

Consider the following recurring election scenario:

  • Three candidates: Carl, Catherine and Colin
  • Three voters: Alice, Ann and Bob
  • Two elections: Election One and Election Two

In Election One, the following results are tallied

Carl 1
Catherine 1
Colin 1

In Election Two we have the following tallies

Carl 2
Catherine 1
Colin 0

furthermore we know from ongoing voting records that Alice and Ann may have changed their votes between Election One and Election Two while Bob did not.

Just one clarification. It may not have been clear why “we know whether voters changed their votes between elections”. The reason is found in the security scheme designed to support liquid democracy. Supporting liquid democracy requires that voters can specify a delegate, and that voters can change said delegate at any time. In such a scheme a bulletin board corresponding to the delegation votes contains ciphertexts that are valid for ongoing elections. Because a choice of delegate persists, a given ciphertext must be valid for several elections. Furthermore, the ability to change one’s delegate requires the ability to overwrite such ciphertext at any time.

Therefore, if a bulletin board contains a ciphertext that is not modified over a certain period of time, it implies that the corresponding voter has not changed delegate. Additionally, if a ciphertext does change at some point in time, it implies only that the corresponding voter may have changed their delegate from that point onwards. These implications are what were given as an additional ingredient in the problem statement.

we know from ongoing voting records that Alice and Ann may have changed their votes between Election One and Election Two while Bob did not.

Here’s the solved table, after Election One. The result is trivial given that there are three choices and three voters who each received one vote.

Carl

Catherine

Colin

Alice

33%

33%

33%

Ann

33%

33%

33%

Bob

33%

33%

33%

And here’s the solved table, after both elections

Election One

Election Two

Carl

Catherine

Colin

Carl

Catherine

Colin

Alice

25%

25%

50%

75%

25%

0%

Ann

25%

25%

50%

75%

25%

0%

Bob

50%

50%

0%

50%

50%

0%

One important thing to note is that after Election Two it was possible to obtain probabilities about Election One that were previously different[1]. In other words, information from a given election allows inference about elections other than itself, the information propagates backwards. 

In the next posts we’ll see how this works.

 


[1] Note that if we had added the “changed vote assumption” to the calculation, it would have been certain that Bob voter for Carl in both elections.

Vote inference and liquid recurring elections, an example

One of the things we’ve been thinking about here at Agora voting is vote inference and recurring elections:

How much information about casted votes can be inferred from analysing historical records of recurring (in particular, recurring delegation) election tallies?

Besides being an interesting problem in itself, this problem is relevant for the purposes of securing liquid democracy via parallel ongoing delegation tallies, as is our approach. Today we’d like to suggest a simple example that motivates the discussion. In the next post we will present some ideas and progress we’ve had in this area.

Consider the following recurring election scenario:

  • Three candidates: Carl, Catherine and Colin
  • Three voters: Alice, Ann and Bob
  • Two elections: Election One and Election Two

In Election One, the following results are tallied

Carl 1
Catherine 1
Colin 1

In Election Two we have the following tallies

Carl 2
Catherine 1
Colin 0

furthermore we know from ongoing voting records (see the voting scheme supporting delegation) that Alice and Ann may have changed their votes between Election One and Election Two while Bob did not.

The question is, what can be inferred about who voted for which candidate from the known data? What can be inferred after Election One, and what can be inferred after both Election One and Election Two? The table below summarizes all the questions that can be asked. Question marks stand for probabilities, given that in general there is insufficient information to answer these questions conclusively.

Election One

Election Two

Carl

Catherine

Colin

Carl

Catherine

Colin

Alice

?

?

?

?

?

?

Ann

?

?

?

?

?

?

Bob

?

?

?

?

?

?

Can you fill this in?

Agora development on windows

So, you want to hack on Agora but are stuck on Windows? Not to worry, just follow these simple steps and you’re ready to go!

  • Inside your machine create an empty folder named agora-ciudadana
  • Install virtualbox (you can use other virtualization software, but we will use vbox here)
  • Download a linux operating system image, here we will use Ubuntu
  • Run the vm and install the operating system
  • Set up port forwarding, for example on port 8000
  • From the running vm window, select Devices > Install Guest Additions
  • From the running vm window, select Devices > Shared Clipboard and set to Bidirectional
  • From the running vm window, select Devices > Shared Folders..
    • Add a shared folder (top right button) named agora-ciudadana that points to the folder created in step 1
    • Press Ok
  • Inside your linux vm, create a directory agora-ciudadana, this will mirror that created in step 1
  • Mount the windows directory with
sudo mount -t vboxsf agora-ciudadana agora-ciudadana

This allows you to access your windows directory created in step 1 from your linux vm. For best results, you want to set this up to run on vm startup. Depending on what Ubuntu you installed (Desktop or Server) you can do this via Startup Applicationsor /etc/rc.local.

  • from the parent directory of your agora-ciudadana directory, clone the agora repository
git clone https://github.com/agoraciudadana/agora-ciudadana.git

And that’s it. Now you can follow the standard linux agora installation as described in the INSTALL.md file. These and any other tasks should work just as they would in a native linux installation, with one exception: you should run the server like this

./manage.py runserver 0.0.0.0:8000

this allows you to access your application from windows at http://localhost:8000

With this set up the agora-ciudadana files reside in the host file system, meaning you can use your favorite windows text editing tools and browsers for development. It also means that if for whatever reason the vm breaks or is corrupted, the agora files will not be lost. And because of the Bidirectional clipboard setting, you can freely copy and paste between your vm and windows system.

Have fun!

Research at Agoravoting

Besides writing great code, at Agoravoting.com we like to think about interesting and difficult problems related to liquid democracy and decision making. Here are some of the things we have our sights on at the moment:

  • Secure liquid democracy

How can liquid democracy be secured[8]? Can existing secure voting schemes be extended to support delegation?

Despite the fact the delegates’ votes are public, the choice of delegate itself remains secret. If you decide to delegate your vote to, for example, “Richard Stallman”, that choice of delegate will be secret. If you decide to create your a delegate named “John Doe” and delegate your vote to your own delegate, that choice of delegate will also remain secret, although your vote as the delegate “John Doe” will be public. So in summary, votes cast by delegates are public, but voters’ choice of delegate is not.

The delegated voting system that we have designed is based on the use a parallel and continuous voting phase that tallies votes that represent a voter’s choice of delegate.

  • Liquid democracy schemes

What methods (eg simple, transitive, dampened transitive, conditional) can be used to implement liquid democracy? What is the complexity-usability tradeoff?

  • Vote inference and recurring elections

How much information about casted votes can be inferred from analysing historical records of recurring (in particular, recurring delegation) election tallies?

When having recurring elections there’s information that can be extracted from the changed votes between any two tallies, even if the votes are secret. We present a path-based vote inference algorithm that collects this information to calculate the probability with which each voter voted to each candidate, and the results of applying this algorithm to real elections.

  • Delegate recommendation and auto-delegation

What machine learning (eg k-NN[1], PCA[2]) methods can be used to recommend delegates that are likely to be a match for voters? What are the optimality conditions required for delegation to be effective?

The set of votes cast in a liquid democracy can be used to define a vector space over voters. Voter similarity can thus be specified in terms of a distance function, or metric, over this space. The question is, how to represent voters as vectors?

The most obvious approach would be to assign one dimension to each election. The content of the vote cast in election n would be the nth component of the vector…

  • Efficient vote time loop detection

How can delegation loops be efficiently detected such that votes are not lost in cycles?

In pre-emptive vote-time loop detection, the system must calculate, for every user of the application, what choices of delegate would result in a delegation loop.  Thus the path begins at each possible user. Because such a calculation is expensive, it should be done “offline”, prior to the display of the delegate choices. The result of such a calculation is then stored so that it can be looked up to determine problem delegates whenever presenting the delegation vote interface.

How can elections be visualized in a scalable way? What visualizations are possible for systems beyond simple plurality (FPTP[7])?

  • Liquid democracy and spectral ranking[3]

What is the relationship between spectral ranking techniques (eg PageRank[4], Katz centrality[5]) and liquid democracy tallying?

  • Agent based models of liquid and representative democracy

Democracy as voting can be conceptualized as a social information processor[6], composed of a set of individual processors (the voters) and an integrator (the voting system) that together produce global decisions.

Can one speak precisely about the quality of these decisions? Can the concept of performance and error be meaningfully defined?

  • Holistic decision making processes

What processes and tools can encompass decision making as a whole?

Democracy is not just about voting, but about the wider problem of making decisions in a community. Good systems for making decisions are more likely to yield decisions that reflect the community’s preferences and exploit its collective knowledge. A decision making system, therefore, takes as input preferences and knowledge and produces as output a decision. Such a decision making system is effectively an information aggregator; high quality decisions are the result of aggregating as much information as possible: no preferences are ignored, and no valuable knowledge is left unused.

If you’re up for a challenge, come and and help us tackle some of these problems!


References

[1] https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm

[2] https://en.wikipedia.org/wiki/Principal_Component_Analysis

[3] http://arxiv.org/abs/0912.0238

[4] http://en.wikipedia.org/wiki/PageRank

[5] https://en.wikipedia.org/wiki/Katz_centrality

[6] http://en.wikipedia.org/wiki/Social_information_processing

[7] https://en.wikipedia.org/wiki/First-past-the-post_voting

[8] http://www.ee.washington.edu/research/nsl/papers/JCS-05.pdf

Open proposal elaboration

(image from cartoonmovement.com)

Democracy is not just about voting, but about the wider problem of making decisions in a community. Good systems for making decisions are more likely to yield decisions that reflect the community’s preferences and exploit its collective knowledge. A decision making system, therefore, takes as input preferences and knowledge and produces as output a decision. Such a decision making system is effectively an information aggregator; high quality decisions are the result of aggregating as much information as possible: no preferences are ignored, and no valuable knowledge is left unused.

This brief description brings us to the distinction between voting and the general case of  decision making. In traditional voting, voters are asked to cast a vote which represents a choice between a predefined set of already existing alternatives. The alternatives presented to the voter are fixed prior to the vote, they have been previously elaborated by some other agent independent of the voters. The voters simply choose among the set presented to them.

In the terms presented above, the knowledge and preferences that voters input into the system only serves to discriminate between a predefined set of options. If the process by which the options were elaborated has not previously already aggregated this information and materialized it into the options themselves, that information could be lost.

In other words, voting is too coarse grained to aggregate the information in all its richness. The grain is determined by the options, if these options do not previously already represent the best quality decisions, the resulting decision will be suboptimal. The optimal decision implicit in the lost information is unavailable as a choice.

The rise of information technology is very relevant to this problem. It is clear, nominally speaking, that a problem of aggregating information is directly related to information technology. More specifically, the communication possibilities that the modern day internet afford are precisely those that are necessary to open up the possibility of aggregating information from hundreds of thousands of sources in a much more refined way than the coarse grained procedure of voting allows.

The key idea is this: information technology and the internet allow extending voters’ participation from just the voting phase, back in time to include the proposal drafting/elaboration (what before were merely fixed options) phase. In contrast to the coarse grain input that voting allows, contributions to a proposal as it is being written are fine grained; modifications to proposals can be arbitrarily small and precise. From adding (or removing) a comma, to a sentence, to a paragraph, to an entire proposal.

We call this opening up of proposal elaboration to the general voter population Open / Distributed / Federated proposal elaboration.

It is open because anyone can contribute. It is distributed/federated because there is no central authority that drafts proposals and imposes them on the voters. Proposals are drafted spontaneously in a distributed fashion. In conclusion, Open/Distributed/Federated proposal elaboration is a decision making medium that need not suffer from information loss that can affect democracy-as-voting. This strong focus on participation makes this approach strongly related to ideas such as collaborative governance[1] and participative democracy[2].

There are two sides to the coin, however. In opening up proposal elaboration to the general public, the gates to more information are opened, but they are also opened to too much information. The key then, is to balance the benefits of the extra possibilities and ideas that this system allows with the extra noise and less useful contributions that can potentially flood the medium. If the signal to noise ratio is too low, optimal decisions can be discarded, not because the required information was not present as we discussed,  but because those decisions were buried under noise that makes them impossible to find.

In summary, a three stage model of information aggregation as decision making is proposed. Although we use the term stage, these three opportunities need not occur chronologically, they may occur in any order and repeatedly.

  1. The first stage of decision making occurs when a set of individuals sets about collaboratively writing[3] or modifying an existing proposal. Ideally this occurs in realtime, these individuals discuss and debate the changes as they are being made. Or they can exchange views or ideas with more traditional means of interaction, offline. In any case, the proposal is the result of their collaboration and contribution, information in terms of preferences and knowledge is integrated to yield the proposal.
  2. The second stage is proposal cross-fertilization[6]. Whereas the focus on aggregation in the first stage was accross individuals collaboratively writing a proposal, in this stage the aggregation occurs across proposals. Proposals may be forked, merged, recombined and mutated. Sections from one proposal may be imported/transcluded into another one. Although the process is still carried out by individuals, proposal cross-fertilization does not  only gather contributions from a subset of users collaboratively making modifications, but rather contributions are drawn from the entire proposal pool.
  3. The third stage is proposal selection, which corresponds to our traditional idea of voting. Having a given pool of proposals that have evolved during the other two stages of aggregation, the voting population is asked to vote for one (or many).

Due to the problems of noised mentioned above, it may be necessary to add extra stages that act as filters. For example, a filter could discard proposals that are not supported by a threshold number of participants.


References

[1] http://en.wikipedia.org/wiki/Collaborative_governance

[2] http://en.wikipedia.org/wiki/Participatory_democracy

[3] By collaborative proposal editing we mean realtime online editing of document by several individuals. The most common examples of this today are found in Google Drive and Etherpad.

[4] http://en.wikipedia.org/wiki/Google_Drive

[5] http://en.wikipedia.org/wiki/Etherpad

[6] Tools that support this model are for example Wikis[7] and Version control systems[8]. The idea of cross fertilization has been suggested in [9]

[7] http://en.wikipedia.org/wiki/Wiki

[8] http://en.wikipedia.org/wiki/Version_control

[9] http://zelea.com/w/Stuff:Votorola/p/position_space_rationalization

Vote-time loop detection and transitive closure

tc

Previously we have spoken about vote-time loop detection as a mechanism to prevent votes from being lost, as the tally in Agora will discard votes that enter a cyle

vote-time loop detection is a mechanism that detects cycles just as they are formed, and prevents them from being established. A feedback message instructs the voter that his/her choice is invalid as it would create a loop, the user must make another selection. How does this detection algorithm work? Well essentially in the same way as it does in the tally algorithm, except that it only checks  loops starting from delegation choices just as they are made by a user, on demand.

But this is not the entire picture, for the following reasons

  • Delegation loops do not invariably result in lost votes. In the event that any of the members in the cycle votes directly, that member will channel all the delegation weight  present in the loop. Still, loops are in principle undesirable.
  • Instead of allowing the user to select an invalid (see the previous point) option and then requiring them to change it, the system could identify problem delegates before the user selects one, pre-emptively.
  • Following the point above, the vote-time loop detection algorithm would have to operate differently, as the starting point would not be a specific choice of delegate, but rather all possible choices. Using the tally-time algorithm would be computationally expensive.

This leads us to two different possibilities, pre-emptive vote-time loop detection, and reactive vote-time loop detection. Reactive vote-time loop detection corresponds to our earlier notion of vote-time loop detection. When a user selects a delegate, the system runs an algorithm that, based on that selected delegate, detects where a loop exists. That is, loop detection begins from the path User => Delegate-choice. This algorithm can proceed by walking the graph, and keeping a record of already visited nodes.

In pre-emptive vote-time loop detection, the system must calculate, for every user of the application, what choices of delegate would result in a delegation loop.  Thus the path begins at each possible user. Because such a calculation is expensive, it should be done “offline”, prior to the display of the delegate choices. The result of such a calculation is then stored so that it can be looked up to determine problem delegates whenever presenting the delegation vote interface.

So how does this data look like? It must contain all the delegation graph information necessary to carry out loop detections cheaply, for all possible users. In effect, this is a matter of graph reachability, and the structure we are looking for is the transitive closure of the relation defined by all existing delegate votes (that is, the delegation graph). The transitive closure of a relation can take the form of a binary adjacency matrix, where a ‘1’ represents that two vertices are reachable, and ‘0’ otherwise.

tc2

The transitive closure (TC) of a directed graph can be calculated using Warshall’s (pdf) algorithm that runs in O(n^3) time. Let’s consider how this could be implemented. First consider that the transitive closure can change whenever the delegation graph changes, that is, whenever a delegate vote is cast, changed or removed. So a naive approach would be to calculate the TC each time this happens, and store it for use during lookup.  Since the TC is an adjacency matrix, it can be easily represented as a database table with two columns.

Lookup is also very easy, as transitive reachability is encoded directly in the adjacency matrix. To determine if a delegate D will cause a loop if selected by user U, we simply have to ask if U is reachable from D, because if it were, the extra path would cause a loop. If stored in a rdbms table, the sql to determine if D is problematic is simply

SELECT * FROM TC WHERE SOURCE = D AND TARGET = U

which would return 1 row if U is rechable and 0 rows otherwise.

But perhaps a naive approach would be inefficient. For every delegated vote cast this would entail

  • Loading all delegated votes (the entire delegation graph)
  • Running Warshall’s algorithm

A better solution would be to calculate the transitive closure incrementally, such that only the varying parts of the TC are recalculated. Even better, if the delegation graph is stored in a database, it would be more efficient if the calculation could be done locally on the database without having to load votes and then run the algorithm.

These two desirable properties bring us to the paper Maintaining Transitive Closure of Graphs in SQL, where such a method is described. As an added benefit, it is formulated in terms of simple sql, requiring no specialized extensions that some databases offer that support graph related operations. We can implement this method as two stored procedures[1] that serve to incrementally maintain the TC table.

As an example, we will construct the TC for the following graph

tc3

which is reflected in the following sql session

tc4

This session shows the edge insertions and the resulting transitive closure table.  Now imagine that delegate 4 has decided to delegate his vote. What delegates would cause a loop? The answer is a straightforward lookup on the TC table:

SELECT START FROM TC WHERE END = 4

which returns the rows with values 1, 2, 3, 5, since any of these will result in a loop.

What delegates would cause a loop for delegate 2? We have

SELECT START FROM TC WHERE END = 2

 which returns rows with values 1, 5.

As can be seen from these examples, detecting possible loops using the TC table is very easy, provided the TC is maintained properly as delegate votes are casted, changed or removed.

 


[1] Following are two stored procedures based on this method implemented for mysql.

Adding an edge

 

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE PROCEDURE `test`.`ADD_EDGE` (IN source INT, IN dest INT)
BEGIN

drop table IF EXISTS TCNEW;

CREATE TABLE TCNEW AS
SELECT *
FROM (SELECT TC.start as start, dest as end
FROM TC
WHERE TC.end = source
UNION
SELECT source as start, TC.end as end
FROM TC
WHERE dest = TC.start
UNION
SELECT TC1.start, TC2.end
FROM TC AS TC1, TC AS TC2
WHERE TC1.end = dest AND TC2.start = source
) AS T;

INSERT INTO TCNEW (Start, End) VALUES (source, dest);

DROP TABLE IF EXISTS DELTA;

CREATE TABLE DELTA AS
SELECT * FROM TCNEW
WHERE NOT EXISTS (SELECT * FROM TC, TCNEW WHERE TC.start=TCNEW.start AND TC.end=TCNEW.end);

INSERT INTO TC
SELECT *
FROM DELTA;

SELECT * FROM TC;

END

Removing an edge

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE PROCEDURE `test`.`REMOVE_EDGE` (IN source INT, IN dest INT)
BEGIN
DROP TABLE IF EXISTS SUSPECT;

CREATE TEMPORARY TABLE SUSPECT AS
SELECT * FROM (
SELECT X.Start as Start, Y.End as End FROM TC AS X, TC AS Y WHERE X.End = source AND Y.Start = dest
UNION
SELECT X.Start as Start, dest as End From TC AS X WHERE X.End = source
UNION
SELECT source as Start, X.End as End FROM TC AS X WHERE X.Start = dest
UNION
SELECT source as Start, dest as End FROM TC AS X WHERE X.Start = source AND X.End = dest
) AS T2;

DROP TABLE IF EXISTS TRUSTY;

CREATE TABLE TRUSTY AS
SELECT * FROM
(SELECT * FROM TC WHERE NOT EXISTS
(SELECT * FROM SUSPECT WHERE SUSPECT.Start = TC.Start AND SUSPECT.End = TC.End)
UNION
SELECT * FROM G WHERE G.Start <> source AND G.End <> dest) as T3;

DROP TABLE IF EXISTS TCNEWD;

CREATE TABLE TCNEWD AS
SELECT * FROM
(SELECT * FROM TRUSTY
UNION
SELECT T1.Start, T2.End FROM TRUSTY T1, TRUSTY T2 WHERE T1.End = T2.Start
UNION
SELECT T1.Start, T3.End FROM TRUSTY T1, TRUSTY T2, TRUSTY T3 WHERE T1.End = T2.Start AND T2.End = T3.Start)
AS T4;

DELETE FROM TC WHERE NOT EXISTS (SELECT * FROM TCNEWD T WHERE T.Start = TC.Start AND T.End = TC.End);

SELECT * FROM TC;
END

Agora, a virtual parliament

Note: this article is an adaptation and translation of the article previously published on Security by Default

The Agora Ciudadana project aims to develop an internet voting system that is cryptographically secure, supports vote delegation, and scales well to massive elections. Agora is free software. These ambitious requirements may seem almost unreachable. Nonetheless, this is what our development team is attempting, and we are reasonably confident that it is possible.

The project objectives are those mentioned above, they are clear cut and arise from the idea behind Partido de Internet (PDI), political party where Agora’s development began. PDI is a non partisan party that does not have, nor will ever have, a political ideology. It has a single and radical proposal: PDI elected representatives will vote in congress according to what the people have previously voted through the internet using Agora.

Secure Voting Scheme

Agora’s security scheme is based on voter authentication and digital signing of votes through the DNIe, and a secure voting scheme based on Mixnets and ElGamal encryption, with the added novelty of support for vote delegation. No new cryptographic primitives are proposed or developed, rather existing and well established protocols are used.

Cryptographically secure voting schemes are generally not well known to the public, but they do provide several desirable characteristics that physical voting protocols lack, such as the possibility that anyone can mathematically verify that the election result is correct (this property is called Universal Verifiability) Additionally, the secrecy of the votes depends on a collection of authorities, in a way that as long as one such authority remains honest, the secrecy of the vote is guaranteed.

How an election works

1. Key generation

In a voting system based on mixnets, the first thing that happens to run an election is the selection of several authorities that will jointly guarantee the secrecy of the vote. Each authority generates an ElGamal public/private key pair, and shares the public key. All such public keys are combined using a simple mathematical procedure that generates a joint public key that will serve as the election’s public key. This is the key that voters will use to encrypt their votes. The public key will be published in the bulletin board, a board where all the public data for the election will be published for anyone to see.

The authorities will also make their individual public key available, so that anyone can verify that the joint public key does in fact correspond to the public keys of all the authorities.

The larger the number of authorities the better, as this increases the security of the election, since more authorities would have to be corrupted to compromise vote secrecy. However, a larger number of authorities also implies a greater computational cost for the election.

2. Vote reception

Once a public key is created for a specific vote, the vote text is established along with the voting period, that is, the time window during which voting is allowed. Once the voting period starts, voters may begin casting votes through the web, or using the voting system’s public API.

Agora is conceived such that any spanish citizen aged 18 or over can vote. To ensure that only eligible citizens may vote, as well as to avoid duplicate votes, Agora uses DNIe based voter authentication. The voter, either through a desktop program or through an applet, selects an option for the vote, encrypts the vote and finally signs it digitally with the DNIe. The digital signature allows voter authentication, that both ensures that voters are eligible as well as avoids the possibility of duplicate votes.

The encrypted (and therefore secret/private) and signed votes will be published on the bulleting board so that anyone can check whether a specific voter has voted, and individual voters can verify that the vote received by the system matches their vote as cast.

3. Vote tallying

Once the voting period ends, vote tallying begins. For a voting system based on Mixnets this proceeds in two phases. First, votes are anonymized such that it is impossible to link a vote to the person who cast it. Then the votes are decrypted and then tallied as plaintexts. This process is carried out by the election authorities.

We refer to this system as being “based on mixnets” because it uses a mixnet to anonymize votes: each authority re-encrypts and re-shuffles votes. Each authority re-encrypts votes in such a way that for example if a voter selected “option 1” and encrypted it as encryption(“option 1”, electionKey) = A (where A is the ciphertext), th re-encryption that authority 1 carries out is re-encryption(A, authority1Key) = A‘ = re-encryption(encryption(“option 1”, electionKey), authority1Key) such that A and A‘ are ciphertexts that appear completely unrelated but correspond to the same plaintext. This process is carried out by every authority so that the end result is something of the form encryption(encryption(“opción 1”, electionKey), electionKey) = A”.

Because of the homomorphic property of ElGamal systems, the decryption of the final re-encrypted ciphertext results in the original plaintext. The correspondence with partial re-encryptions is impossible to establish. This, together with the fact that authorities also reshuffle re-encrypted votes, guarantees that the final, one step decryption of the ciphertexts does not compromise privacy. In summary, the link from ciphertext to plaintext is broken, the vote is thus anonymized.

Once the encrypted votes have anonymized, the authorities jointly decrypt them, the tally is then trivial. The results are published on the bulletin board. Thus, the original ballots emitted by the voters are never decrypted, their secrecy is maintained. The authorities themselves do not know the voter’s choice. Only if all the authorities colluded would it be possible for them to decrypt one or several votes. For this reason a greater number of authorities grants the system more security.

As stated before, the elections are universally verifiable. This means that even if all authorities were corrupt it would be possible to detect election fraud. This is true by virtue of the fact that for each of the steps of the election processing (re-encryption, shuffling and decryption), the authorities must provide mathematical proofs that they operated correctly, and must publish these proofs on the bulletin board. These proofs are zero-knowledge, that is, the content of the proof reveals no information, so secrecy is again maintained. I highly recommend the wikipedia article on zero knowledge proofs (based on the the paper “How to explain zero knowledge proofs to your kids”) to understand how they work.

Delegated voting scheme

According to my calculations, approximately 6600 votes take place per year, only counting congress. This corresponds roughly to a vote every hour, on average. Not many people will be willing or able to excercise their vote in an informed maner given such a high rate of voting events, especially considering that many of these are decisions on lesser matters that may not very significant for the voter. For this reason, vote delegation becomes an essential requirement for Agora, such that voters can delegate their vote to someone they trust and that can spend more time researching voting matters. At the same time, the voter will always maintain the ability to override his delegated vote at any time and excercise a direct vote for specific matters, thus retaining control in a flexible manner.

Anybody can create a delegate (or Proxy) in Agora, all that is necessary is for the delegate to be appropriately registered in the system. Examples of delegates could be “Richard Stallman”, “Green Peace”, “The Republican Party”, “Amnesty International”. Votes cast by delegates are public, and they must be cast prior to the direct voting period. This ensures that delegates can never deceive those that would delegate their votes to them; said voters could always override their delegated vote in time if they are unhappy with their delegate’s choice.

Despite the fact the delegates’ votes are public, the choice of delegate itself remains secret. If you decide to delegate your vote to, for example, “Richard Stallman”, that choice of delegate will be secret. If you decide to create your a delegate named “John Doe” and delegate your vote to your own delegate, that choice of delegate will also remain secret, although your vote as the delegate “John Doe” will be public. So in summary, votes cast by delegates are public, but voters’ choice of delegate is not.

The delegated voting system that we have designed is based on the use a parallel and continuous voting phase that tallies votes that represent a voter’s choice of delegate.

Like in any other election, a number of authorities are necessary that will jointly create the public key, and will process the votes as necessary. For a regular vote taking place in congress the options present in the ballot are “YES”, “NO”, “ABSTAIN”. A ballot for an election in the delegated system contains one option per possible delegate. This particular election (which we can call delegate election) never finishes, tallies are repeated periodically, and the options present in the ballot (corresponding to the list of delegates) may change over time as new delegates are registered.

For example, lets consider that today I, Eduardo Robles, delegate my vote to Richard Stallman. What I’m really doing is encrypting my ballot whose content is roughly “My delegate is Richard Stallman” with the public key of the delegate election, signing it with my DNIe, and sending it to Agora that will validate it and store it.

In the next vote in congress, say a decision to modify intellectual property law (IPL), the delegate Richard Stallman casts a vote, for example “Yes”. This choice would be public, and once the direct voting period opens, Richard Stallman would have had to already specify his vote. This would allow me to override my delegated vote with a direct vote, or change my choice of delegate, in case I did not agree with his vote.

Once the direct voting period is over, the tally is carried out for both elections: one corresponding to the specific vote on the IPL, and one corresponding to the delegate election. Say for example that the result in the direct vote is 100,000 YES, 30,000 NO and 1,000 ABSTAIN. Additionaly, in the delegated election, imagine that Richard Stallman received 1,000,000 delegated votes. Because Richard Stallman, as a delegate, published his vote as YES, then those votes would be added to the YES option for the direct vote, resulting in a total of 1,100,000 for YES. The process is exactly the same for all delegates; for each of these their sum total of delegated votes is added to the direct vote option that said delegate publicly selected.

Note that Agora is the first cryptographically secure voting system that supports vote delegation; this represents a genuine contribution. A paper is forthcoming where we will describe this novelty in detail.

Technology

In terms of software components, the Agora project consists of two main parts: the backend and the frontend. The backend is Verificatum, a library whose main developer is Douglas Wikström, a swedish cryptography researcher. We are working on Verificatum to speed it up and allow it to scale massively. For the frontend we will use Agora On Rails, developed by members of PDI. The frontend will need to be modified to accomodate the security scheme we have designed.

Massive Voting

As stated earlier, approximately 6600 votes take place yearly in congress alone. This translates to roughly one vote per hour on average. In the last general election in Spain there were 35 million voters. These two figures together mean that on average we must be able to process 35 million votes per hour to allow all possible voters to participate; if we cannot attain this throughput we could possibly shut out voters, an unacceptable scenario.

We have conducted benchmarks that show that with 3 authorities, a key length of 2048 bits, and a reasonably powerful machine per authority it is possible to process approximately 300,000 votes per hour. However, we are working with the developers of Verificatum on a more optimized version with increased performance, as well as adding the ability to scale out to more machines. Also, we do not rule out the possiblity of processing votes with graphics cards (GPU’s) using OpenCL.

Voting through the internet? Are you mad?!

Internet voting has its dangers, there is no doubt about that. If the client computer is compromised, for example with a virus, it could cast votes incorrectly, not cast them at all, or reveal the voter’s choice, all of these without the voter knowing anything was wrong. In order to address these issues we will create a linux distribution on a LiveCD/LiveUSB so that voters can generate their vote in a secure environment, without an internet connection. They can then cast their vote via the internet once their vote has been encrypted and signed from within the secure environment. Additionally, we will carry out public information campaigns to describe the system, how it works, the possible dangers, and how the DNIe works.

Of course, there is also the issue of coercion, of which we are very aware of. Coercion can take place locally, for example if your boss demands to oversee how you vote to ensure you make the choice he expects. This type of coercion already exists with traditional voting, for example voting through traditional mail. It may also exist if somebody demands to enter the private voting booth with you. More specific to internet voting is the problem of non-local coercion or vote buying, where it is possible for a voter to prove to others what he/she voted without the need for said agents to be present. This is a real problem we are aware of and we are researching ways to mitigate it. Nonetheless, the problem of coercion for high frequency, continuous elections is qualitatively different from those of traditional elections. Voting power is spread out in time accross different elections, and it is harder and more costly for malicious agents to achieve their ends than if power were concentrated in few, far between and general elections. We may present a paper to describe these differences in depth in the near future.

I personally am not a huge fan of internet voting because of the security issues that may arise. I would prefer to have voting booths with a certified and periodically verified (by third parties) computer at each town and city. It would also be desirable not to depend on the police as the single authority that provides and verifies DNIe signatures and certificates. Nonetheless, despite the problems our system may have, the possibility of offering the citizenship a working and practical liquid democracy through PDI seems to far outweigh the potential dangers, and for this reason it is worth purusing this objective, and do things as best as possible.

If things do move forward positively, we will be able to direct many more resources to create a much more secure and robust system including some of the measures I have suggested above, but some of these are too costly initially.

Agora for all and all for Agora

Although the Agora project was born out of a specific need that PDI had to fulfill, we soon realised that it could be a tool that could be useful to many more people, so Agora was recently detached from PDI and stands as an independent project in its own right. Of course, Agora is still a central element of PDI’s strategy, but anybody who shares an interest in the projects is welcome to collaborate on its development. Agora has been from the beginning been and will be free software.

Agora is a software project with a clear aim to improve our democratic system. The project is well underway but still not complete, and is driven by voluntary work donated generously by members of our team. We welcome anyone, developers, researchers, security enthusiasts, designers, or anyone else who shares our vision, to collaborate and help bring this vision closer to reality.