Minutes of IEEE P1622 Meeting, February 24-25, 2013

Attendees in Person
Kenneth Bennett, Kim Brace, Megan Dillon, Paul Eastman, Rita Eaton, Lynn Garland (February 24th only), Brian Hancock (February 24th only), Linda Harley, Arthur Keller, Nick Koumoutseas (February 24th only), John Lindback, James Long, Tammy Patrick, Ian Piper, Don Rehill (February 25th only), Mark Skall (February 24th only), Paul Stenbjorn (February 25th only), John Wack, David Webber, Sarah Whitt, Malia Zaman

Attendees by Phone
Duncan Buell (February 25th only), Charles Corry, Kurt Hyde, Neal McBurnett, John McCarthy, John Sebes (February 25th only), Paul Stenbjorn (February 24th only), Beth Ann Surber

Call to order
The meeting was called to order at 1:05pm on February 24, 2013.

Introduction
Opening remarks were made by John Wack, who explained that the general goal of the meeting was to discuss the election results reporting standard that the subgroup has been working on. He reviewed the agenda for the next day and a half and stated that all materials from these meetings will be made available on the internet.

Arthur Keller reviewed IEEE policy regarding the IEEE call for patents, to ensure that attendees knew how to disclose any patents that could be relevant to the work of this group.

Policies and Procedures revisions
Arthur Keller discussed the proposed changes to the policies that govern the P1622 group. Specific changes that were highlighted included:
- Policy regarding how individuals can become and remain voting members of the P1622 group. Since the group does not meet in person on a regular basis, it was proposed that the membership criteria be changed to allow attendance at teleconferences to count toward the attendance requirement.
- Creation of an immediate past chair position, whose duty would be to advise the current chair on IEEE policies and procedures.
The new proposed policy is based on the most recent IEEE template for the formation of a standards committee, ensuring that the P1622 group policy is in compliance with the IEEE template. The IEEE template proposes adding a rank order to the vice presidents if multiple such positions exist; in the absence of the chair, the vice presidents would, in order, serve in the role of moderating meetings.

A new section 6.6 was proposed that deals specifically with teleconferences, necessitated by the fact that attending teleconferences would now count toward attaining membership status. Formal votes cannot be conducted via teleconference, and advance agendas are not required for teleconferences. In addition, it was proposed that teleconference meetings be announced on the P1622 group web site seven days prior to the meeting.

Section 8.1, which covers who has access to draft materials from the working groups, was discussed. Arthur Keller explained the overall process of how the working groups operate, how information should be shared, how standards will be distributed and shared for comments, and how these standards will become finalized and formalized. In short, P1622 is unique in that its proposed standards will be open for a period of public comment; the working group will consider and incorporate these comments as needed, and the P1622 group will take a formal vote on the standards before they become final.

A question and answer session on the proposed policy ensued, with Arthur Keller addressing specific points and highlighting where these issues are addressed in the policy.

John McCarthy moved to adopt the policy. John Wack seconded. The motion carried unanimously: all voting members voted yes, and all observers voted yes except Paul Eastman and Malia Zaman, who did not participate.

Election Assistance Commission
Brian Hancock introduced Megan Dillon, who will be representing the EAC in the P1622 group.
Brian Hancock indicated that the EAC is committed to assisting the P1622 group by reaching out to election officials and manufacturers and encouraging them to become involved with the group. He discussed the role of the EAC and how the standards produced by the P1622 group may be incorporated into the VVSG. One aspect of this will be the development of conformance and interoperability testing.

Mark Skall discussed the difference between conformance and interoperability. Once the schema is created, tests need to be developed to ensure that voting system components conform to the schema. Interoperability testing will ensure that multiple components of the voting system are interoperable in a pairwise manner.

David Webber pointed out that interoperability and conformance testing are independent of each other: if a system passes an interoperability test, that does not mean it passes the conformance test, and vice versa. The goal of the P1622 group is to create a schema that allows for multiple permutations of this testing; templates based on the schema, however, would be specific and would test specific aspects and criteria. First, though, a standard needs to be in place before a conformance test can be developed. Mark Skall equated this to having a large international standard for which specific profiles are created; conformance tests can then be created for each of the relevant profiles before the interoperability tests are created.

David Webber mentioned that it would be beneficial to create small examples as part of the standard. A test suite could then be created to assist in the evaluation of conformance and interoperability. Brian Hancock explained that cost plays an important role here: the outcome of these testing procedures must be cost effective, and cost will therefore be a major criterion during the test development phase.
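The distinction drawn above between conformance and interoperability can be illustrated with a toy sketch. This is not the P1622 schema or test suite, which do not yet exist; the element names and the required-children rule below are invented purely to show how the two kinds of checks differ.

```python
# Toy illustration: "conformance" checks one document against schema rules;
# "interoperability" checks that one component can consume another's output.
# Element names (ElectionResults, Jurisdiction, Contest) are hypothetical.
import xml.etree.ElementTree as ET

REQUIRED_CHILDREN = {"Jurisdiction", "Contest"}  # assumed rule, for illustration

def conforms(xml_text: str) -> bool:
    """Conformance: does this one document satisfy the (toy) schema rules?"""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return root.tag == "ElectionResults" and REQUIRED_CHILDREN <= present

def interoperates(producer_output: str, consumer_parse) -> bool:
    """Interoperability: can a consuming component parse the producer's output?"""
    try:
        consumer_parse(producer_output)
        return True
    except ET.ParseError:
        return False

doc = "<ElectionResults><Jurisdiction/><Contest/></ElectionResults>"
print(conforms(doc))                       # passes the toy schema rules
print(interoperates(doc, ET.fromstring))   # a second component can parse it
```

Note that the two checks really are independent, as recorded above: a well-formed document that two components exchange happily can still omit a required block, and a fully conformant document can still trip up a particular consumer.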
Kenneth Bennett discussed election officials' need and desire to purchase components for voting systems, and noted that these components would have to pass conformance testing; eventually the goal would be for these components to be certified individually. John McCarthy proposed that several stages have been identified on the path to this goal, and that these stages should be documented to assist with the discussions moving forward. Sarah Whitt reminded the group that changes to the current method of testing voting systems and equipment might require policy changes in some states.

Action Item: The discussion concluded with the group informally reaching a consensus that it would be beneficial to document these stages, to assist in mapping out the path between where the group currently is and achieving a standard for interoperability and conformance testing of components.

Election results reporting standard
John Wack opened this discussion by stating that the P1622 group needs more individuals who are familiar with XML to assist in the development, prototyping, and testing of the XML schema. He introduced the individuals who currently serve on the task force working on the PAR that deals with election results reporting. The purpose of this task force was to pick an interoperability task that should be easy to do; there is a desire to produce a standard in short order in order to gain traction in the voting community.

Three use cases have been developed: reporting on election night, county roll-up, and post-election. The goal is to address all three use cases in one schema. The schema is being designed for today's needs with a look toward future possibilities. At a minimum, the standard should incorporate what is already required in the VVSG, but then go beyond those requirements, for example the ability to report overvote and undervote counts.
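As a rough sketch of the kind of payload being discussed, the fragment below builds a small results document that carries per-contest overvote and undervote counts alongside candidate totals. Every element and attribute name here is an invented stand-in; the actual vocabulary (loosely inspired by EML 520) is still being drafted by the task force, and the numbers are made up.

```python
# Hypothetical sketch of a results-reporting payload with contest metrics.
# All names (ElectionResults, ContestMetrics, etc.) are illustrative only.
import xml.etree.ElementTree as ET

results = ET.Element("ElectionResults")
ET.SubElement(results, "Jurisdiction", {"id": "county-042"})  # assumed id scheme
contest = ET.SubElement(results, "Contest", {"id": "governor"})
ET.SubElement(contest, "CandidateVotes", {"candidateId": "cand-1"}).text = "10452"
ET.SubElement(contest, "CandidateVotes", {"candidateId": "cand-2"}).text = "9980"
metrics = ET.SubElement(contest, "ContestMetrics")  # overvote/undervote bucket
ET.SubElement(metrics, "Overvotes").text = "12"
ET.SubElement(metrics, "Undervotes").text = "87"

print(ET.tostring(results, encoding="unicode"))
```

The point of the sketch is structural: candidate totals and quality metrics such as overvotes and undervotes live side by side under one Contest block, so a consumer can aggregate or audit either without a second feed.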
In addition, the standard proposes tags to serve as a method of assisting election officials with reporting needs. John Wack noted one question that has to be discussed: whether the standard should be broadened to include tagging for individual machine-level reporting. The question was also raised whether this standard needs to focus only on election night and county roll-up, or on the third use case as well. Discussion continued on this point for some time, with various individuals offering examples of how they might use this tagging; the decision on scope was tabled to be discussed after the overview and presentations of the current schemas were completed. Arthur Keller stated that the scope of the PAR is election reporting in general; therefore, if the scope changes, the PAR will need to be revised.

John Wack continued the overall introduction and explained that one of the considerations was the potential size of these files. One solution to reduce file size was to use identifiers. The other important requirement was the flexibility to analyze the data at virtually any level.

Kim Brace gave a brief presentation on "Basic Election Administration" to help the group understand the larger context of the work. The key point is the diversity of laws, processes, and so on. There are 50 states with 50 different election laws, and each county is responsible for interpreting and implementing those laws. There are roughly 3,000 counties in the US; in total there are 10,072 different jurisdictions, each doing something different. The mean size of a jurisdiction in the US is 1,492 registered voters. Over one third of the counties have fewer than 10,000 voters, only 343 counties have more than 100,000 voters, and only 14 counties have more than 1 million voters. Geography plays another complicating role: cities can span different counties or districts.
There is a geographical overlay that divides people, and often the borders do not line up. The smallest building block that election officials work with is the ballot grouping; this is the bottom level. For analysis to be conducted at any level, and for counts to be aggregated correctly, the schema will have to go down to the level of ballot groupings.

David Webber presented the current version of the schema that the task force has created. The XML information was presented in the form of a mind map to assist in visualizing the structure. A few icons on the map needed clarification:
- ? indicates that a block is optional.
- * indicates that multiples of a block can exist (for example, Contest).
- @ indicates that an attribute is being described.

This XML is based on EML 520, which focuses on the vote counts. There is also the possibility of using EML 530, which focuses on percentages and can be utilized for analysis. The introduction section of the schema follows that recommended by EML 520, since this information consists of housekeeping items, such as who generated the file. The purpose of this schema is to be versatile enough to aggregate counts across cities, precincts, counties, and states. One key aspect of making this work is a jurisdiction block that helps identify the geographical location. This information would be set well in advance of election night, since it is based directly on the ballot groupings that have been created, and it is unlikely to change on election night. What will change is the contest section. A few aspects of the contest section are worth noting:

Contest Metrics: This would be unique to the proposed standard, in that it could make it possible to count the number of overvotes and undervotes occurring for a specific contest. There may be other metrics, such as blank votes, that also need to be considered.
Contest Counts: This bucket would be used to report such things as the number of contests that have been counted.

Identification: Used to identify the contest.

One concern raised was the need for a method of recording whether a precinct has completed voting. This could be done by adding a status field at the precinct level under the jurisdiction bucket, along with possibly noting the total number of registered voters.

Arthur Keller noted that there are specific questions that need to be answered for the task force to move forward successfully:
- Whether the scope of the work should be limited to election night reporting only, or extended down to individual machine reporting.
- Whether or not the use cases are appropriate.
- Reporting, and whether jurisdictions have one or multiple reports.

John Wack wrapped up the first day of the meeting by stating that the task force has a good structure in place; the question now is whether there is too much or too little detail. The meeting was adjourned, to resume Monday morning, February 25th, at 8:30am.

February 25, 2013
The meeting was called to order at 8:48am. Arthur Keller conducted the roll call, then listed those individuals who are eligible to become members of the P1622 group. The following individuals became members: Kenneth Bennett, Charles Corry, Linda Harley, Kurt Hyde, Sarah Whitt.

Officer Elections
Paul Eastman chaired the proceedings for officer elections.

Chair: Arthur Keller nominated John Wack as chair. Beth Ann Surber seconded. John Wack was appointed chair of the P1622 group by unanimous acclamation.

Vice Chair: John Wack nominated Neal McBurnett for first vice chair and Kenneth Bennett for second vice chair. Sarah Whitt seconded. Neal McBurnett and Kenneth Bennett were elected first and second vice chair, respectively, by unanimous acclamation.

Secretary: John Wack nominated Linda Harley for secretary. Beth Ann Surber seconded.
Linda Harley was appointed secretary by unanimous acclamation.

Immediate past chair: Arthur Keller will serve in this role.

Sponsored standards committee
Arthur Keller explained that thus far he has been representing the P1622 group on the Standards Activity Board (SAB). What is currently being proposed is the creation of a sponsor committee that would oversee the work conducted by the P1622 working group; Arthur will work with the SAB to create this sponsored committee. The purpose of the sponsored committee would be to have a broader scope, addressing other aspects and issues of voting standards. It was noted that individuals could serve on multiple working groups. Paul Eastman noted that the creation of a sponsored standards committee would allow for a broadening of scope and for any additional future work on voting that needs to be conducted.

Sarah Whitt asked what the scope of the sponsored committee would be. Arthur Keller explained that this committee would focus on the larger aspects of voting systems, especially electronic data interchange. This provides an opportunity to work on other standards, for example usability. John Lindback asked how expanding the work of this new committee would integrate with the work being done by the EAC. Arthur Keller explained that the IEEE would serve in a support role, assisting the EAC with the development of testing standards, and that the new group would continue to work closely with the EAC.

Malia Zaman made the point that the process of creating a sponsored committee at the IEEE could take six months to a year. The IEEE SAB meeting is in June, and the deadline to submit the scope and policies would be early to late April. Malia Zaman explained that once the committee is approved by the SAB, it must then be approved by the ADCOM before it becomes an official group. The entire process can last up to a year.
Arthur Keller stated that he will work on the draft documents for this sponsored committee and send them out to the P1622 group for feedback. John Wack explained that there are other standards the group could work on that are not specific to equipment, which would make for a bigger effort and fall outside the scope of the P1622 group; for example, common terms and definitions. Working on a unified taxonomy would assist in the standards development process.

Continuation of the election results reporting standard discussion
Don Rehill provided the perspective of what the Associated Press (AP) does for election night reporting. He is the director of elections and tabulations on election night for AP. The purpose of his division is to tabulate all of the election results on election night and release them to media outlets. AP takes in information from counties and voting jurisdictions, some at the precinct level, and converts it to a common vocabulary before providing it to media clients. On election night AP has stringers out at the precincts to talk directly with election officials and phone counts directly back to AP. AP has a database that takes in information from multiple sources and processes it, checking, for example, that no more votes are reported in a jurisdiction than there are registered voters. These logic checks help ensure the validity of the numbers, and the database and counts are constantly updated on election night. Currently AP also takes in feeds from large counties that have data available in XML, flat-file, or message formats. AP compares the feeds across multiple sources and picks the more reliable ones; what AP sends out to the media is its best judgment of the most accurate information.
This information is perishable, in that media clients want the most direct information in order to call races and move on. One key thing with election night reporting is classifying the status: that is, are we partway done with this race, and how do we represent that? This information is used to call races. In the past, the method has been to report, for example, that 60 out of 400 precincts are in to indicate progress.

Sarah Whitt asked whether this requires AP to keep track of the different contests and elections and how they vary from one jurisdiction or precinct to the next. Don Rehill stated that the research group compiles all of that information months in advance, in order to understand the rules and definitions at the county and state levels. Some of the information is hard to obtain, since you do not know the total number of ballots cast, turnout, and so on. Including a progress indicator in the schema will assist with this.

Paul Stenbjorn said that the amount of advance data that AP collects is something that has not been discussed in the P1622 group, and asked whether AP has a data scheme that it uses. Don Rehill answered that AP has a database it uses to compile the information. The structure is initially built with tentative numbers until they can be confirmed on election night; these are the zero files, and they need to capture this data months in advance.

Kenneth Bennett pointed out that the concept of x over y for percentage completion of a race might be inaccurate, since it does not always include mailed-in ballots. Don Rehill indicated that AP addresses this by finding a ceiling number based on the number of ballots that were mailed out. They flag those as the "a" report, or absentee report. In the days after election night, AP keeps track of these until the race is called. John Wack said this sounds like an area where the group can do some work, to determine whether there are other associated metrics that could be of use.
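The two progress metrics discussed above, the "x of y precincts reporting" indicator and the absentee ceiling based on ballots mailed out, amount to simple arithmetic that a schema field could carry. The sketch below shows both computations; the function names and figures are illustrative only, not part of any draft standard.

```python
# Illustrative progress metrics for election night reporting.
# Names and numbers are made up; only the arithmetic reflects the discussion.

def precinct_progress(reported: int, total: int) -> float:
    """Fraction of precincts whose counts are in (the 'x of y' indicator)."""
    return reported / total

def absentee_ceiling(ballots_mailed: int, returned_so_far: int) -> int:
    """Upper bound on absentee ballots still outstanding, per AP's approach
    of using ballots mailed out as the ceiling."""
    return ballots_mailed - returned_so_far

print(f"{precinct_progress(60, 400):.0%} of precincts reporting")   # 15% of precincts reporting
print(absentee_ceiling(5000, 3200), "absentee ballots outstanding")  # 1800 absentee ballots outstanding
```

As Kenneth Bennett's point implies, the first metric alone can mislead: a race can show 100% of precincts reporting while the absentee ceiling is still nonzero, which is why the two numbers belong together in any status field.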
From Don Rehill's discussion, it sounds as if AP will always be looking at multiple feeds for information, but the purpose of the P1622 schema should be to assist in processing the numbers as quickly as possible. The schema should include a place for ancillary notes to keep track of these special cases. David Webber stated that this flexibility has been built into the current draft of the schema: you can create all of the information, but it will need to be tagged. It would be useful to see any of the work that AP is able to share; the task force will be mocking up the schema and putting it out for feedback, so having AP look at it would be greatly beneficial.

John Wack next directed the conversation to the list of questions still before the P1622 group that need to be decided in order to assist the task force in moving forward. As a recap, changes have been made to the EML 520 schema, and it now includes many of the things Don Rehill discussed: there are multiple buckets and the ability to link identifiers to vote results. The question is whether we are trying to do too much in the current PAR, or whether we should focus only on election night reporting.

Arthur Keller stated that whether to have multiple new schemas or one large schema is a technical decision that the technical experts should make. However, in terms of the use cases, he proposed that use case one be divided into state reporting and county reporting levels, as these may have different granularities of information. Paul Stenbjorn agreed that there may be different use cases but said this does not necessitate different schemas: we want a common data format, and having multiple schemas moves away from that goal. Arthur Keller clarified that he did not suggest different schemas, but different use cases, as the data feeds may be different. David Webber stated that he prefers the three use cases.
We could write use case one as 1a and 1b when providing examples for the state and county levels of how the schemas could be used. We have a medium level of complexity in the draft schema; what we will need to provide in the documentation is examples. In addition, it is important to provide guidelines on how to properly create the IDs and how they are intended to be used. Sarah Whitt remarked that if we start dividing up the current use cases, they could explode simply because of the different combinations in which one could view this data; therefore, it is preferable to keep like things together. We are balancing complexity with simplicity, and keeping the use cases simple while providing different examples is probably the way to go.

Paul Stenbjorn suggested that the real difference is not between state and county but between tabulation and aggregated counting; this is why one could conceivably get election reporting down to the individual machine level. David Webber stated that the intent for the draft schema was to start with election night reporting based on EML 520; we should focus on this effort first, before starting to add to and build out the schema further. Beth Ann Surber agreed and stated that it is important to get something out there for the states to start using. Neal McBurnett stated that we should stick with the existing PAR, which focuses not only on election night reporting but also on the other overlapping areas. James Long stated that implementing use case one depends on use case two; if we ignored use case two, we would constantly be changing and updating the schema. We need a common data format, and we need to keep that goal in mind. Kenneth Bennett voiced his concern that a single schema for everything might be unwieldy and that implementation would be a concern.

Event logging standard
Duncan Buell explained the research he has been conducting with DREs and event logging.
Often the time stamps of the DREs have no meaning or do not even exist, so it is not possible to determine the length of time that individuals were using the DRE. It would be useful to log this information and compare it to the amount of time that voters stand in line. In addition, there are several human factors issues with DREs that could be evaluated and tracked with event logging.

James Long stated that much of the legacy equipment is not nearly up to current equipment capabilities, and thus for some systems it will never be possible to obtain this information. For these scenarios we could definitely have some buckets in the schema. According to the VVSG there are six types of measures that need to be made. Ian Piper agreed that buckets would be easy and useful to include in the schema; however, the biggest problem will be creating a common taxonomy/lexicon that all can agree on, which would become the information placed in those buckets. With a common lexicon, analysis tools could be created to assist with the evaluation and reporting of these machines. He agreed, however, that for the legacy systems there is no solution. James Long said that providing buckets in the schema should be easy, but creating a common taxonomy/lexicon is outside the scope of the current project. Arthur Keller agreed that it would make sense to create a new PAR to deal with this issue.

Arthur Keller moved that Duncan Buell, James Long, Kenneth Bennett, Paul Stenbjorn, Ian Piper, and Kim Brace work on drafting the purpose and scope of a new PAR on event logging. Sarah Whitt seconded. The motion passed by unanimous vote.

Arthur Keller moved to amend the existing PAR for election reporting and strike the words "schemes 510, 520, and 530, which contain." Linda Harley seconded the motion. Charles Corry moved to amend the amendment and also strike the words "Oasis EML Version 7." Kurt Hyde seconded the motion. Discussion ensued.
John Wack stated that the current wording does not imply that the group is restricted to EML, only that it is focusing on it. Paul Eastman suggested that the PAR not be changed, as changing it may open it up to scrutiny. Malia Zaman explained that the rules for PARs have changed: if the scope is decreasing this is all right, but if the scope is increasing it will be scrutinized. The deadline to have revisions and the PAR in to the SAB is May 3, in order for them to be voted on at the June meeting. Sarah Whitt moved to postpone the vote until after lunch; no one seconded, so the motion to postpone failed.

The amendment to the amendment was to strike "Oasis EML Version 7." Three voted aye, everyone else voted nay, and no one abstained. The amendment to the amendment failed.

The amendment was to strike "schemes 510, 520, and 530, which contain." Fifteen voted aye, one voted nay, and no one abstained. The motion to amend the PAR passed.

Open Source Digital Voting
Anne O'Flaherty presented on work that Open Source Digital Voting (OSDV) is doing in collaboration with the State of Virginia. OSDV is a pending non-profit organization working toward open source voting resources; Trust The Vote is supported by OSDV. Anne O'Flaherty is the project manager of the project with Virginia. There are many stakeholders, including election officials, donors, and advocates. The mission was to develop publicly owned open source software covering everything from voter registration to election reporting. With Virginia, they created a digital ballot that was printed, signed, and then mailed in; Microsoft was the hosting vendor. They adapted online tools to assist with this and implemented EML 410, which is a blank ballot format. For details, please refer to the PowerPoint slides submitted by Anne.

PEW
John Lindback talked about some research that PEW has conducted.
PEW found that there are three components suggested by academics, election officials, and advocacy groups that would improve the accuracy and accessibility of voter registration: online registration, greater automation at the DMV, and the Electronic Registration Information Center (ERIC). ERIC is a tool election officials can use to make their lists more accurate. The software was developed by IBM and includes sophisticated analysis that can calculate the odds that records from multiple sources refer to the same person. About one in eight people move every year. Last year seven organizations went through the process of creating ERIC. ERIC is now run by the states and no longer by PEW; it will be on its own finances and staffing as of June, with PEW continuing to assist on an as-needed basis. States have submitted voter registration data and DMV data into a database. ERIC also contains information from 80 million Social Security records that assist with the matching. ERIC generates lists for clients showing DMV individuals who are eligible to vote but are not registered; last year an outreach effort resulted in over 300,000 individuals registering as voters. The lists also assist in flagging individuals who have moved, even across state lines. All of the data being submitted to ERIC assumes that the DMV is sending over records that exclude minors and non-citizens.

NIST election data model development
John Wack presented work that NIST is doing on creating a data model for a common data format to assist in visualizing the information. Data models assist with looking at the information from a more process-oriented perspective, followed by the expected flow of operations and the concept of operations. There are many advantages to looking at the general problem first and then doing a gap analysis. It would be useful to develop a model of common subsets and to look at the sub-areas.
Off this model one can then hang the schema to see where it fits in and to identify gaps. This would help facilitate communication with non-technical experts. UML is a modeling language and is what was used in the current data model of the election process. Kenneth Bennett stated that having this model really assists in understanding the big picture; it is a great step in helping individuals understand the schemas and how they can assist with interoperability. Kenneth Bennett has a few edits but will discuss them offline with John Wack. David Webber cautioned that while XML can be generated from UML, a lot of work would have to go into conceptually getting all the details down, and the result may still change because processes differ widely. However, it is useful for conceptualizing what is being proposed.

Other business
Kenneth Bennett suggested that the next meeting of the group could be held at LA County. Arthur Keller moved that the voter registration discussion be limited to 5 minutes; the motion passed by unanimous acclamation. The following individuals volunteered to assist with the creation of the scope of a PAR for voter registration: Sarah Whitt, Kim Brace, Paul Stenbjorn, David Webber, Don Rehill, and John McCarthy. Discussion ensued about aspects of voter registration and the need for standards in this area. Kim Brace gave a presentation on a task force he is involved with investigating the long lines in Prince William County, Virginia.

The meeting was adjourned at 5:00pm.