Exam 1 Review
CS411

Follow the instructions or lose a point
• One point total, not one point per mistake
• There are no “gotcha” instructions. They’re things like write your name on every page and make sure it matches what’s on Gradescope
• This is one of the key learnings of this class. I’m not being arbitrary

Software Engineering
CS411

What is software?
• A program is made out of code functions
• Software is made out of programs
• It is very likely that you haven’t written any software at this point in your career
• You might be a professional programmer and never have to write software
• But you will (should) still use the tools and principles of software engineering

What is software engineering?
• “A systematic approach to the analysis, design, assessment, implementation, test, maintenance and re-engineering of software, that is, the application of engineering to software.”
• Software engineering encompasses methodologies and principles related to
  - Conceiving
  - Communicating
  - Specifying
  - Designing
  - Documenting!
  - Building (meh)
  - Designing
  - Testing
  - Maintaining

Defining Issues
A Process Framework
Boston University
▪ A process framework is a set of guidelines, work products, and tools that attempt to facilitate a process
▪ For software engineering in general, the framework comprises
  • Communication
  • Planning
  • Modeling
  • Construction
  • Deployment
CS411 Software Engineering

Features vs Benefits
▪ These are from Entrepreneur magazine

Features                      Benefits
Self-setting clock            Works out of the box
50-number speed dial          Don’t have to remember phone numbers
One-touch financial reports   Speed and ease
Batteries included            Don’t have to have / buy them
Open 24 hours                 Convenience of being able to shop whenever

A Layered Technology
[Figure: Software Engineering as layers, listed as: tools, methods, process model, a “quality” focus]

Traceability
• Documentation describing how
  • Requirements are linked to business needs
  • Design / tests are linked to requirements
  • Specifications are linked to design
  • Code is linked to specifications

The Software Development Life Cycle
CS411

Bottom line
▪ Software development is a planned activity
▪ We use process both to manage the software development lifecycle (SDLC) activities and to improve the process itself
▪ Each company uses its own SDLC model but all of them are essentially Define, Design, Develop, Deliver, DMaintain
▪ The differences are often of scale and/or concurrency of activities

Plan-driven vs agile processes
▪ Plan-driven, or prescriptive, processes are processes where all of the process activities are planned in advance and progress is measured against this plan; they strive for an orderly approach to development.
▪ In agile processes, planning is incremental and it is easier to change the process to reflect changing customer requirements.
▪ In practice, most practical processes include elements of both plan-driven and agile approaches.
▪ There are no right or wrong software processes.

Plan-driven vs agile development
[Figure: plan-driven vs agile development]

The waterfall model
Define, Design, Develop, Deliver

Problems with the Waterfall (I)
▪ Complete up-front specifications with sign-off
- Research showed that 45% of features created from early specifications were never used, with an additional 19% rarely used [Johnson02].
- Over-engineering: a study of 400 projects spanning 15 years showed that less than 50% of the code was actually useful or used [CLW01].

Problems with the Waterfall (II)
▪ Late Integration and Test
- The waterfall pushes these high-risk and difficult issues toward the end of the project. Waterfall is called a fail-late lifecycle.
▪ Reliable Up-front Estimates and Schedules
- Cannot be done when the full requirements and risks are not reliably known at the start, and high rates of change are the norm.
▪ “Plan the work, work the plan” values
- Limited value for high-change, novel, innovative domains such as software development.

Waterfall model problems
▪ Inflexible partitioning of the project into distinct stages makes it difficult to respond to changing customer requirements.
- Therefore, this model is only appropriate when the requirements are well-understood and changes will be fairly limited during the design process.
- Few business systems have stable requirements.
▪ The waterfall model is mostly used for large systems engineering projects where a system is developed at several sites.
- In those circumstances, the plan-driven nature of the waterfall model helps coordinate the work.

Incremental development
[Figure: incremental development]

Incremental development benefits
▪ The cost of accommodating changing customer requirements is reduced.
- The amount of analysis and documentation that has to be redone is much less than is required with the waterfall model.
▪ It is easier to get customer feedback on the development work that has been done.
- Customers can comment on demonstrations of the software and see how much has been implemented.
▪ More rapid delivery and deployment of useful software to the customer is possible.
- Customers are able to use and gain value from the software earlier than is possible with a waterfall process.

Incremental development problems
▪ The process is not visible.
- Managers need regular deliverables to measure progress. If systems are developed quickly, it is not cost-effective to produce documents that reflect every version of the system.
▪ System structure tends to degrade as new increments are added.
- Unless time and money are spent on refactoring to improve the software, regular change tends to corrupt its structure. Incorporating further software changes becomes increasingly difficult and costly.

▪ When would you use agile and why?
▪ When would you use waterfall and why?

The extreme programming (XP) release cycle
[Figure: the extreme programming (XP) release cycle]

The Hierarchy
1. Theme / Initiative
- Make more money off of grid pages
2. Epic
- Make discovery “more fun”
3. User Story
- I am a frequent shopper and I just need something and I want to get replacement-level value fast
4. Task
- Increase product review visibility
▪ Subtask / Ticket: this is what you log your time against
- Increase font size backend
- Color picker backend
- Launch color / font size A/B test
- …

Incremental development vs delivery
▪ Incremental development
- Develop the system in increments and evaluate each increment before proceeding to the development of the next increment;
- Normal approach used in agile methods;
- Evaluation done by user/customer proxy.
▪ Incremental delivery
- Deploy an increment for use by end-users;
- More realistic evaluation about practical use of software;
- Difficult to implement for replacement systems as increments have less functionality than the system being replaced.
30/10/2014 Chapter 2 Software Processes

Incremental delivery (to the customer)
[Figure: incremental delivery (to the customer)]

Incremental delivery advantages
▪ Customer value can be delivered with each increment so system functionality is available earlier.
▪ Early increments act as a prototype to help elicit requirements for later increments.
▪ Lower risk of overall project failure.
▪ The highest priority system services tend to receive the most testing.

Incremental Delivery: Move from Deliverable to Deliverable
▪ Don’t build the left half of a car
▪ Build a skateboard
- Good enough. Do something else
- Don’t want it. Do something else
- Not fast enough. Give it a motor
- Not steerable enough. Make it a scooter
▪ The bit you stand on is tested and retested at each milestone

Incremental delivery problems
▪ Most systems require a set of basic facilities that are used by different parts of the system.
- As requirements are not defined in detail until an increment is to be implemented, it can be hard to identify common facilities that are needed by all increments.
▪ The essence of iterative processes is that the specification is developed in conjunction with the software.
- However, this conflicts with the procurement model of many organizations, where the complete system specification is part of the system development contract.

Containerization and Orchestration
Part 1 of 3
CS411 Spring 24
Peter B. Golbus

    deploy:
      name: deploy
      needs: tests
      runs-on: ubuntu-latest
      steps:
        - name: Deploy to server
          uses: appleboy/ssh-action@v1.0.3
          with:
            host: ${{ secrets.SERVER_HOST }}
            username: ${{ secrets.SERVER_USER }}
            key: ${{ secrets.SSH_PRIVATE_KEY }}
            script: |
              # stop the running bot
              screen -r -d birthday-bot
              PID=$!
              kill $PID
              # rebuild the virtualenv from scratch
              rm -rf .venvs/birthday-bot/
              python -m venv .venvs/birthday-bot
              source .venvs/birthday-bot/bin/activate
              # update the code and its requirements, then restart
              cd birthday-bot
              git pull
              git checkout no_docker
              git rebase main
              pip install -U pip
              pip install -r requirements.lock
              python bot.py&

• What if I want it on a different box?
• What if I install a breaking update?
I want to package the code with its requirements!

birthday-bot deploy process

Step 1: Push “an image” to dockerhub

    push:
      name: push
      needs: tests
      runs-on: ubuntu-latest
      steps:
        - name: checkout
          uses: actions/checkout@v3
        - name: set github environment variable
          run: echo "VERSION_NUMBER=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
        - name: setup docker
          uses: docker/setup-buildx-action@v2
        - name: login
          uses: docker/login-action@v1
          with:
            username: ${{ secrets.DOCKER_USERNAME }}
            password: ${{ secrets.DOCKER_PASSWORD }}
        - name: build
          # the tag must match the name used by the push step below
          run: |
            docker build . --file Dockerfile --tag pgolbus2/cs411-birthday-bot:${{ env.VERSION_NUMBER }}
        - name: push
          run: docker push ${{ secrets.DOCKER_USERNAME }}/cs411-birthday-bot:${{ env.VERSION_NUMBER }}

Step 2: Pull and “run” the image

    deploy:
      name: deploy
      needs: push
      runs-on: ubuntu-latest
      steps:
        - name: checkout
          uses: actions/checkout@v3
        - name: set github environment variable
          run: echo "VERSION_NUMBER=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV
        - name: Deploy to server
          uses: appleboy/ssh-action@v1.0.3
          with:
            host: ${{ secrets.SERVER_HOST }}
            username: ${{ secrets.SERVER_USER }}
            key: ${{ secrets.SSH_PRIVATE_KEY }}
            script: |
              echo ${{ secrets.DOCKER_ACTUAL_PASSWORD }} | docker login -u "${{ secrets.DOCKER_USERNAME }}" --password-stdin
              docker pull pgolbus2/cs411-birthday-bot:${{ env.VERSION_NUMBER }}
              # "|| true" keeps the deploy going when no old container exists
              docker stop birthday-bot || true
              docker rm birthday-bot || true
              docker run -d --name birthday-bot --env "DISCORD_BOT_TOKEN=${{ secrets.DISCORD_BOT_TOKEN }}" --rm -it pgolbus2/cs411-birthday-bot:${{ env.VERSION_NUMBER }}

Infrastructure as Code
• Infrastructure as Code (IaC) is an approach to managing and provisioning infrastructure resources using machine-readable configuration files or scripts.
• You don’t install stuff on a box, you write out and push the configuration
• Reproducible
• Scalable
• Automatable
• Version controlled

Containerization
• Definition: Containerization is a lightweight form of virtualization that allows applications and their dependencies to be packaged and run in isolated environments called containers.
• Overview:
  • Containers encapsulate the application, along with its runtime, libraries, and dependencies, into a single unit.
  • They provide consistent and reliable execution across different computing environments.
  • Containerization enables efficient resource utilization, faster application deployment, and improved scalability.
  • Containers are isolated from the underlying infrastructure, ensuring application portability and compatibility.

Containerization vs Virtualization
VMs fake the hardware; containers fake the operating system.

Containers                        VMs
Lightweight                       Strong isolation
Faster start up and scaling       Configurable Resource Utilization
Efficient Resource Utilization    Dedicated Hardware Resources
Isolation without overhead        Live Migration

VMs remain very important!

Docker components
• Docker Image
  • An IaC “container” template
• Docker (Runtime) Container
  • A runnable instance of an image
• Docker Engine
  • The runtime environment that builds images and runs containers
• Docker Registry
  • A repository for storing docker images

Docker Images
• A Docker image is a read-only template that contains the application code, runtime, libraries, dependencies, and configuration files required to run an application.
• Characteristics:
  • Immutable: Docker images are immutable, meaning they cannot be modified once created. Any changes to the image require building a new image based on the updated configuration or code.
  • Portable: Docker images are portable and can be shared and run on any system that has Docker installed, regardless of the underlying infrastructure or operating system.
  • Hierarchical: Docker images can be based on or layered upon other images.
    This hierarchical structure allows the reuse of base images and the incremental building of custom images.

Docker Container
• A Docker container is a runnable instance of a Docker image. It is a lightweight, isolated environment that encapsulates the application and its dependencies, allowing it to run consistently across different environments.
• Characteristics:
  • Runnable: Docker containers are the executable instances created from Docker images. They can be started, stopped, and managed independently.
  • Isolated: Each Docker container is isolated from other containers and the underlying host system, providing process-level isolation and resource constraints.
  • Stateful or Stateless: Docker containers can be designed to be stateful, where they preserve data across container restarts, or stateless, where they don't retain any data once stopped.

What is the difference between building an image and running a container?
• Make sure you can answer some version of this question
• Building an image
  • Create a runnable template
• Running a container
  • Create and launch an instance of the template
Think classes and objects, approximately

Department of Computer Science
CS411
Version Control With git
@perrydBUCS

What we're trying to solve
▪ Track all changes to code base
▪ Rewindable history
▪ Collaboration
▪ Keep dev, test, prod code separate
▪ Work on features and fixes nondestructively

▪ Decentralized (git)
  ▪ Each dev has a copy of the code base
  ▪ Concurrent work on a file is possible
  ▪ Local versions (called branches) are embraced
▪ Advantages:
  ▪ No locking of files
  ▪ Concurrency
  ▪ Simple branching
▪ Disadvantages:
  ▪ Local branches must be merged
  ▪ History can become complex

git concepts
▪ git records local changes made to files in a directory (and its subdirectories)
▪ Those records are essentially snapshots of the state of all files at a given moment
▪ Multiple concurrent histories, called branches, are used to isolate specific work, for example a bug fix or new feature
▪ Two branches can be merged together, combining all of the changes made to both branches
▪ Local copies can be synchronized with other developers’ local copies, or with branches stored on GitHub

Detecting conflicts
▪ What if two developers change the same file?
  1. They changed different parts of the file and you can use both
  2. They made overlapping changes
▪ git can tell you if this happened, but you must decide what to do about it

Commits
▪ git doesn’t store the current codebase, it stores code deltas
▪ You can get to the current codebase by applying the deltas to the starting point
▪ So how do you detect conflicting deltas?
▪ Deltas are stored as “commits.” Commits contain the code delta and a function describing the current state

Merkle Trees (1979)
▪ Can I detect if my copy and your copy of L1–L4 match w/out comparing them?
▪ Yes.
  Compare the top hash
That there exists an algorithm to
• Detect a conflict
• Find the conflict efficiently
You do NOT need to know details of the algorithm

Branch
▪ A branch is a pointer to the head of a series of commits
- A history, and
- A HEAD

Merge
▪ Merge main into <feature>
▪ This creates a new commit

Rebase
▪ Rebase <feature> onto main
▪ Existing commits are incorporated into branch history

merge vs merge --squash
▪ merge preserves all commit history
By convention, GitHub will always display the parents as:
• The first parent, which is the branch you were on when you merged,
• and the second parent, which is the commit on the branch that you merged in.
git - What is the difference between merge --squash and rebase? - Stack Overflow https://stackoverflow.com/a/25519606
▪ merge --squash combines merge history into a single commit

Merges create commits
▪ A correct answer to a question like “what does a merge do?” must describe the new commit

<branch> vs origin/<branch> vs origin <branch>
▪ origin is the computer with the canonical repo
▪ origin <branch> is the canonical version of the branch on origin
▪ origin/<branch> is my current snapshot of origin <branch> as of the last time I pulled

Department of Computer Science
CS411
Using (RESTful) APIs
@perrydBUCS & ChatGPT

Application Programmer Interfaces (APIs)...
▪ Allow two different systems to communicate with each other
▪ Provide a standardized interface for exchanging data between systems
▪ Allow developers to leverage the functionality of existing systems, making it easier to build complex applications
▪ Allow for easier maintenance and updates of software systems, as changes can be made without breaking the API interface
▪ Bottom line: an API is a black box contract à la class-based design

Interacting with APIs
▪ Every time you import a package, you are using its API
▪ It is also common for APIs to use HTTP requests and responses to exchange data

REST
▪ REST is a concept that describes a set of constraints and principles for building web services.
▪ Was introduced by Roy Fielding in his 2000 doctoral dissertation, where he defined it as a set of architectural constraints that emphasizes scalability, simplicity, and generality.
▪ Based on the idea of resources, which are identified by URIs (Uniform Resource Identifiers), and can be manipulated using a small set of well-defined HTTP methods, such as GET, POST, PUT, and DELETE.
▪ A popular choice for building web services due to its simplicity, scalability, and interoperability, and is widely used by many popular web services, such as Twitter, Facebook, and Google.
▪ However, it's important to note that not all web services that use HTTP and URIs are RESTful, and adhering to the principles of REST can sometimes require trade-offs between simplicity and performance.

HTTP methods for APIs
▪ The HTTP protocol (set of rules) specifies 8 primary methods (verbs) and some flavors of those
▪ For API work we'll usually just use four:
  ▪ GET: to retrieve data from the server
  ▪ POST: to submit data to the server to create a new resource, such as when creating a new record in a database
  ▪ PUT: to update an existing resource on the server
  ▪ DELETE: to delete a resource or collection of resources on the server
▪ The astute student will note that these map to the CRUD operations for data

Data formats
▪ You'll see data in various formats, including XML, CSV, and JSON
▪ Each format has strengths and weaknesses
▪ Which one you use often is dictated by the API vendor
▪ Nearly always you'll use a package / library to handle this type of data
▪ The goal is serde (serialize / deserialize)
- serializing data into some raw form
- transmitting that data over a network (or filesystem)
- de-serializing that data into some language-specific model (object, dict, …)
▪ Here's a quick look at the three most common ones...
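
The serialize / de-serialize round trip described above can be sketched with Python's standard library. This is only an illustration, not something from the slides: the record and the helper names (to_json, from_json, to_csv, from_csv) are hypothetical, and the "transmit" step is skipped because the serialized text is simply handed straight back.

```python
import csv
import io
import json

# Hypothetical record, loosely modeled on the contact fields in the CSV slide.
record = {"firstName": "John", "lastName": "Doe", "age": 30}


def to_json(rec):
    """Serialize a dict into a JSON string (the raw form)."""
    return json.dumps(rec)


def from_json(text):
    """De-serialize a JSON string back into a Python dict."""
    return json.loads(text)


def to_csv(records):
    """Serialize a list of dicts into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv(text):
    """De-serialize CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


json_copy = from_json(to_json(record))
csv_copy = from_csv(to_csv([record]))[0]

print(type(json_copy["age"]).__name__)  # JSON keeps the number as an int
print(type(csv_copy["age"]).__name__)   # CSV hands every value back as a str
```

The two print lines hint at one of the trade-offs between formats: JSON carries basic types through the round trip, while CSV returns every field as text, so the receiving side has to convert values itself.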

CSV / TSV
▪ Comma-Separated Values, Tab-Separated Values: a simple, text-based data format that is used to represent tabular data, such as spreadsheets or databases.
▪ In CSV, each row of data is represented as a line of text, with each value separated by a comma.
▪ The cool thing about CSVs is that they can also be opened in Excel
▪ To parse CSV data in a JavaScript application, you can use a third-party library, such as Papaparse or csv-parser, to read the CSV data and convert it into a JavaScript object or array.
▪ An example:

    firstName,lastName,age,street,city,state,zip,type,number
    John,Doe,30,123 Main St,Anytown,CA,12345,home,555-1234
    Jane,Smith,25,456 Oak Ave,Somewhere,FL,54321,work,555-5678

JSON
▪ (JavaScript Object Notation): a lightweight, text-based data format that is easy to read and write, and is widely used in web APIs.
▪ JSON data consists of key-value pairs, where keys are always strings, and values can be strings, numbers, booleans, arrays, or objects.
▪ The cool thing about JSON is that it is far simpler than XML and far more expressive than CSVs
▪ To parse JSON data in a JavaScript application, you can use the built-in JSON.parse() method to convert a JSON string into a JavaScript object, and the JSON.stringify() method to convert a JavaScript object into a JSON string.
▪ An example: [Figure: JSON example]

▪ Of these three, web applications usually use JSON
  ▪ It's lightweight, easily parsed
  ▪ APIs typically use JSON
▪ XML is useful when you need to validate the data against a description of what should be there
  ▪ The DTD (Document Type Definition) provides a way to describe the type, bounds, and behavior (required, etc) of each data item
  ▪ XSLT (Extensible Stylesheet Language Transformations) is a way to do what looks like ETL of XML, translating from one format to another
▪ We see a lot of XML in banking and other industries where data cleanliness is important

Protecting keys
▪ Whichever authentication method you use, be sure to take steps to protect your API keys
▪ It isn't a great idea to store keys in the front end
  ▪ You don't control front-end execution
  ▪ Since front-end code is JavaScript (always), the key must be visible on the front end
  ▪ Even if you obfuscate the key, it has to be parseable on the front end and is therefore exposed
▪ The right way to protect API keys is to hold them on a back-end server, never exposing them to the front end
  ▪ Use of a proxy serverless function (like AWS Lambda) is popular; keys are secured in the cloud and only executing AWS functions see them
▪ NEVER STORE THEM WITH CODE! NEVER PUSH THEM ANYWHERE!

▪ Pushing API keys to GitHub, GitLab, or any other public repository is a guaranteed way to have your credentials stolen
▪ Use a .gitignore file (or equivalent) to prevent API keys from being pushed
▪ This means that you need some other way to distribute keys around your dev team

sync vs async
▪ In Python, we get to make synchronous blocking calls. The script will wait until it gets a result. If it takes 4 seconds, it takes 4 seconds
▪ In JavaScript, we have to make asynchronous non-blocking calls. The webpage has to render and can’t wait for a response. If it takes 4 seconds, you have to be displaying something in the meantime

Our clocks are synchronized down to 3 light-feet
This isn’t on the test. Just OMG!!!!
[Figure annotation: |length – length| < 3 FEET!]

Promises
Why, not what or how
• Promises in JavaScript are a way to handle asynchronous operations and provide a clean and organized way to write asynchronous code.
• A Promise represents the eventual completion (or failure) of an asynchronous operation and allows you to attach callbacks to be executed when the operation completes.

When it’s done, what do you do?
then()
Why, not what or how
• The then() method returns a new Promise that can be used to chain additional then() methods or catch() methods.
• If the success or error callback returns a value, the new Promise returned by then() will be fulfilled with that value.
• If the success or error callback throws an error or returns a rejected Promise, the new Promise returned by then() will be rejected with that error or reason.
• The then() method can be called multiple times on the same Promise, allowing you to chain multiple success and error callbacks together.

Requirements Analysis
The Only Thing You Need to Get Right
CS411

Goals, requirements and non-goals
▪ Goals:
- What problems is it supposed to solve?
▪ Requirements:
- Non-functional / Functional / Domain
  ▪ What is it supposed to achieve? What is it supposed to do? What does it have to do?
  ▪ Ex:
    ▪ Non-functional: Secure
    ▪ Functional: End-to-end encryption
    ▪ Domain: HIPAA compliant
▪ Non-goals:
- What is out of scope for this project?
This is just as important as your goals! Beware scope creep!

Requirements elicitation/discovery
▪ What problems are you trying to solve? What do you need it to do? What properties do you need it to have?
▪ Why?
- If you don’t understand why, you can’t be sure you understand what
▪ You should push back
How
- User: “We’d like for this screen to have a blue background.”
- You: “Why?”
- User: “User studies demonstrated an increase in conversion rate”
- You: “Would you rather have a color picker?”

REQUIREMENTS REPRESENTATION AND DOCUMENTATION
• Different audiences require different requirement documentation
• Trace your work to all of them
This document is the bottom line. If you deliver the requirements and it doesn’t achieve the business goals, you still did everything right

Types of requirements
▪ User requirements
- Statements in natural language plus diagrams of the services the system provides and its operational constraints. Written for customers.
▪ System requirements
- A structured document setting out detailed descriptions of the system’s functions, services and operational constraints. Defines what should be implemented so may be part of a contract between client and contractor.

Use cases and requirements
▪ The use case is the collecting point for describing how a system will behave in broad areas.
▪ Once use cases have been developed, another pass is made in order to elicit requirements
- Example: The use case for account management includes the user changing a password
- The requirements would describe how many characters the password must be, what character set, duration, and so on

Non-functional requirements
▪ These define system properties and constraints, e.g. reliability, response time and storage requirements. Constraints are I/O device capability, system representations, etc. (speeds and feeds)
▪ Process requirements may also be specified mandating a particular IDE, programming language or development method.
▪ Non-functional requirements may be more critical than functional requirements. If these are not met, the system may be useless.
▪ We often call these the ‘-ilities’ since they describe things like reliability, scalability, and so on

REQUIREMENTS VALIDATION AND VERIFICATION
▪ Requirements validation: “am I building the right product?”
▪ Requirements verification: “am I building the product right?”
Concepts only. You don’t need to know which word is which

Precision in usability requirements
▪ The system should be easy to use by medical staff and should be organized in such a way that user errors are minimized. (Goal)
▪ Medical staff shall be able to use all the system functions after four hours of training. After this training, the average number of errors made by experienced users shall not exceed two per hour of system use.
  (Testable non-functional requirement)
▪ The use of the words should and shall here is very precise
- The book has a huge list of these words and their precise meaning

Characteristics of good requirements
▪ Traceable: Each requirement must lead back to a specific business or system need which is owned and documented
▪ Unambiguous: Wording should be specific and clear and avoid jargon and ‘intended’ meaning
▪ Singular: One requirement at a time
▪ Measurable and Testable: It must be possible to develop a test to verify that the requirement has been met
▪ Self-consistent: Requirements should not contradict one another
▪ Feasible: Someone must be able to actually meet the requirement; prototyping might be needed in order to show feasibility
▪ Uniquely identified: Each requirement must be identified and attached to a use case
▪ Design agnostic: Requirements gathering and documenting is part of the inception phase…no technology assumptions should be made at this stage
▪ Formal: Use words with specific meaning such as shall, must, may, should, and so on to specify intent

Use cases vs user stories
▪ A use case is a formal description; it contains a significant amount of information
▪ Most of the time we’ll start with user stories, which are less formal
▪ The user story is a description, with props if needed, of a specific set of interactions with the app from either a user’s or a subsystem’s perspective
▪ Once we have a set of user stories we can flesh them out into more formal use cases
▪ For small projects the user story is often enough; larger, more complex projects go with use cases
▪ Many agile methodologies incorporate a user story into planning
▪ The story takes the form of “as a <specific actor> I want to <some action> in order to <benefit>”
  - As a logged-in user I want to change my password to improve the security of my account
  - As a power-user I want to be able to access all features quickly so I can do what I want with minimal friction
  - As a new user I want the most important features to be obvious so I can learn to use the software quickly

▪ If it starts “as a user I want to…” it’s a non-functional requirement
▪ User stories can be conflicting. Resolving this is a business decision
  - As a power-user I want to be able to access all features quickly so I can do what I want with minimal friction
  - As a new user I want the most important features to be obvious so I can learn to use the software quickly
▪ User stories have to have a benefit
  - “As a security-conscious user I want to have password protection” is a non-functional requirement.
  - Benefits make money, not features

▪ You haven’t been graded on writing requirements so I can’t ask you to write them on the test

Maintenance and Parnas Partitioning and (Re)factoring
CS411 Summer 23
ChatGPT (& Peter B. Golbus)

Black box modeling
[Diagram: the inside of the box is labeled "whatever and not your problem"]
Your responsibility is to
• Agree on what constitutes valid input and output
• Detect invalid input
• Produce valid output that is correct
[Diagram caption: "Is my input correct? AKA your output? I have no way of knowing"]

Encapsulation of Concerns
1. Improves Maintainability
• Code is easier to update when concerns are encapsulated.
• Changes in one part of the system don’t affect unrelated parts.
2. Enhances Readability & Understandability
• Developers can focus on individual components without needing to understand the entire system at once.
• Encourages clear, modular design.
3. Supports Reusability
• Well-encapsulated components can be reused across different projects.
• Reduces duplication and improves consistency.
4. Enables Black Box Design
• Components expose only necessary functionality.
• Developers can assume other parts are correct and interact via well-defined interfaces.
5. Reduces Bugs & Increases Reliability
• Minimizes unintended side effects.
• Encourages controlled access to internal data.

Parnas Partitioning
• Overview:
  • Definition: Parnas Partitioning, introduced by David Parnas, is a principle used in software engineering to modularize software systems by dividing them into smaller, manageable, and loosely coupled components.
  • Goal: The primary goal is to enhance maintainability, flexibility, and comprehensibility of software by minimizing interdependencies among components.
• Key Concepts:
  • Modularity: Breaking down the system into discrete modules that can be developed, tested, and maintained independently.
  • Information Hiding: Each module hides its internal workings from other modules, exposing only necessary interfaces.
  • High Cohesion, Low Coupling: Modules should have high internal cohesion and low coupling with other modules to reduce the impact of changes. (see refactoring section)
Partition everywhere!
• Benefits:
  • Improved Maintainability: Easier to update and fix modules without affecting the entire system.
  • Enhanced Flexibility: Modules can be modified or replaced independently.
  • Simplified Testing: Independent modules can be tested in isolation, leading to more robust software.
  • Reusability: Well-defined modules can be reused across different projects.
• Applications:
  • Large-scale Software Systems: Used in complex projects like operating systems and enterprise applications.
  • Software Product Lines: Facilitates the creation of a family of related products with shared components.
  • Agile Development: Supports iterative development by allowing incremental changes and improvements.
  • Everything!

Not Hiding Information is Bad!!

    class Counter:
        """Minimal base class, assumed here; its definition is not shown in this excerpt."""

        def __init__(self, start: int = 0):
            self._count = start  # internal state; callers should use get_count()

        def get_count(self):
            """Returns the current count."""
            return self._count


    class AutoIncrementingCounter(Counter):
        def get_count(self):
            """Returns the current count but also increments it every time it's called."""
            current = self._count
            self._count += 1
            return current


    # Functions that interact with Counter
    def print_counter_wrong(counter: Counter):
        """Prints the count, violating information hiding."""
        print(f"Count: {counter._count}")


    def print_counter_correct(counter: Counter):
        """Prints the count without violating information hiding."""
        print(f"Count: {counter.get_count()}")


    aic = AutoIncrementingCounter(10)
    print_counter_correct(aic)
    print_counter_wrong(aic)
    print_counter_wrong(aic)
    print_counter_correct(aic)
    ...

What makes a good split point?
• Cohesion:
  • Split points should result in cohesive components with a clear responsibility or purpose.
  • Aim for high cohesion within each split component, minimizing dependencies on external modules.
• Loose Coupling:
  • Minimize dependencies between split components to achieve loose coupling.
  • Encapsulate interactions and dependencies, allowing components to evolve independently.
• Testability:
  • Identify split points that facilitate unit testing and isolation of components.
  • Split components should be testable in isolation, enabling thorough testing and reducing regression risks.

How to partition
• Identify System Requirements:
  • Understand the functional and non-functional requirements of the system.
  • Determine the critical functionalities and their dependencies.
• Decompose the System:
  • Break down the system into major functional areas.
  • Use top-down or bottom-up approaches to identify potential modules.
• Define Module Boundaries:
  • Establish clear boundaries for each module.
  • Ensure each module encapsulates a distinct functionality or responsibility.
• Ensure High Cohesion and Low Coupling (see refactoring section):
  • Design modules to have high internal cohesion, meaning related functionalities are grouped together.
  • Minimize dependencies between modules to achieve low coupling.

How to partition
• Apply Information Hiding:
  • Hide the internal implementation details of each module.
  • Define and expose only necessary interfaces and interactions.
• Create Detailed Module Specifications:
  • Write detailed specifications for each module, including responsibilities, interfaces, and hidden details.
  • Specify how modules will interact and communicate with each other.
• Iterate and Refine:
  • Continuously iterate on the design, refining module boundaries and interfaces as needed.
  • Test modules independently and in combination to ensure they work together seamlessly.

How do you work with legacy code?
• One piece at a time
• Step 1:
  • Break it into pieces
SPLIT

Identifying split points
Functional Decomposition:
• Identify cohesive sections of the codebase based on their functionality.
• Look for areas that handle distinct responsibilities or perform specific tasks.
Module Extraction:
• Identify self-contained modules within the codebase.
• Look for sections that can be extracted as separate modules, encapsulating related functionality.
Service or API Extraction:
• Identify sections that provide specific services or functionality to other parts of the system.
• Extract these sections as services or APIs, allowing them to be independently maintained and consumed.

Steps in refactoring code
1. Understand and Document the Existing Code:
• Gain a thorough understanding of the existing code and its behavior. Analyze and document how it functions, its dependencies, and its impact on other parts of the system.
2. Define the Desired Outcome:
• Clearly define the goals and desired outcomes of the refactoring.
  Determine what improvements you want to achieve, such as better readability, improved performance, or enhanced maintainability.
3. Identify Opportunities:
• Identify the specific code section, method, or class that needs refactoring. It could be a piece of code exhibiting a code smell or an area that needs improvement.
4. Identify Test Points and Design Test Suite:
• Identify critical areas of code that require thorough testing to ensure proper functionality and maintain desired behavior after refactoring. Determine the key functionality, corner cases, and edge cases that need to be tested to verify the correctness of the refactored code.
• Aim for comprehensive test coverage to ensure that the refactored code behaves as expected and doesn't introduce regressions or unintended side effects.

[Figure: Number of imports. Too many: Tightly Coupled. Too few: Not Cohesive. Not too many and just enough: Highly Cohesive and Loosely Coupled.]

Architecture Design
CS411

What is a Software Architecture?
▪ A model, optimized to solve a business problem
▪ The software architecture of a system is the set of structures needed to reason about the system, which comprise software elements, relations among them, and properties of both.
  - The Software Engineering Institute

Software Architecture vs Software Programming
Software Architecture           Software Programming
• interactions among parts      • implementations of parts
• structural properties         • computational properties
• system-level performance      • algorithmic performance
• Outside module boundary       • Inside module boundary

Chapter 9 Design Concepts
• Part Two - Modeling
© 2020 McGraw Hill. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or further distribution permitted without the prior written consent of McGraw Hill.
Software Engineering Design
• Data/Class design – transforms analysis classes into implementation classes and data structures.
• Architectural design – defines relationships among the major software structural elements.
• Interface design – defines how software elements, hardware elements, and end-users communicate.
• Component-level design – transforms structural elements into procedural descriptions of software components.

Design Concepts 2
• Information Hiding – controlled interfaces which define and enforce access to component procedural detail and any local data structure used by the component.
• Functional independence – single-minded (high cohesion) components with aversion to excessive interaction with other components (low coupling).
• Stepwise Refinement – incremental elaboration of detail for all abstractions.
• Refactoring – a reorganization technique that simplifies the design without changing functionality.
• Design Classes – provide design detail that will enable analysis classes to be implemented.

Design Class Characteristics
• Complete (includes all necessary attributes and methods) and sufficient (contains only those methods needed to achieve class intent).
• Primitiveness – each class method focuses on providing one service.
  • A function is a verb that does exactly one thing
• High cohesion – small, focused, single-minded classes.
• Low coupling – class collaboration kept to minimum.

Design Modeling Principles 1
• Principle #1. Design should be traceable to the requirements model.
• Principle #2. Always consider the architecture of the system to be built.
• Principle #3. Design of data is as important as design of processing functions.
• Principle #4. Interfaces (both internal and external) must be designed with care.
• Principle #5. User interface design should be tuned to the needs of the end-user and stress ease of use.

Design Modeling Principles 2
• Principle #6. Component-level design should be functionally independent.
• Principle #7. Components should be loosely coupled to each other and to the environment.
• Principle #8. Design representations (models) should be easily understandable.
• Principle #9. The design should be developed iteratively.
• Principle #10. Creation of a design model does not preclude using an agile approach.

Chapter 10 Architectural Design – A Recommended Approach
• Part Two - Modeling

Architectural Styles
• Each style describes a system category that encompasses:
  1. a set of components (for example: a database, computational modules) that perform a function required by a system.
  2. a set of connectors that enable “communication, coordination and cooperation” among components.
  3. constraints that define how components can be integrated to form the system.
  4. semantic models that enable a designer to understand the overall properties of a system by analyzing the known properties of its constituent parts.

Data Centered Architecture

Data Flow Architecture

Call Return Architecture
We have a whole separate deck for this

Architectural Considerations
• Economy – software is uncluttered and relies on abstraction to reduce unnecessary detail.
• Visibility – Architectural decisions and their justifications should be obvious to software engineers who review.
• Spacing – Separation of concerns in a design without introducing hidden dependencies.
• Symmetry – Architectural symmetry implies that a system is consistent and balanced in its attributes.
• Emergence – Emergent, self-organized behavior and control are key to creating scalable, efficient, and economic software architectures.
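The data flow style is named above without a surviving diagram, so here is a rough illustration of one common form of it, pipe-and-filter, in the vocabulary of the Architectural Styles slide: filters are the components and the pipeline function is the connector. This is a sketch only; `strip_blank`, `normalize`, and `pipeline` are made-up names, not something from the textbook.

```python
from typing import Callable, Iterable, Iterator

# A filter is a component: it consumes a stream and produces a stream.
Filter = Callable[[Iterable[str]], Iterator[str]]


def strip_blank(lines: Iterable[str]) -> Iterator[str]:
    """Filter: drop empty or whitespace-only lines."""
    for line in lines:
        if line.strip():
            yield line


def normalize(lines: Iterable[str]) -> Iterator[str]:
    """Filter: trim surrounding whitespace and lowercase each line."""
    for line in lines:
        yield line.strip().lower()


def pipeline(source: Iterable[str], *filters: Filter) -> Iterator[str]:
    """Connector: feed each filter's output into the next filter's input."""
    stream: Iterator[str] = iter(source)
    for apply_filter in filters:
        stream = apply_filter(stream)
    return stream


result = list(pipeline(["  Hello ", "", "WORLD"], strip_blank, normalize))
print(result)  # ['hello', 'world']
```

The constraint in this style is that filters know nothing about each other; each one only sees the stream handed to it, which is what lets them be reordered or replaced independently.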
Chapter 11 Component-Level Design
• Part Two - Modeling

Component-Level Design Guidelines
• Components – Naming conventions should be established for components that are specified as part of the architectural model and then refined and elaborated as part of the component-level model.
• Interfaces – provide important information about communication and collaboration (as well as helping us to achieve the OCP).
• Dependencies and Inheritance – For readability, it is a good idea to model dependencies from left to right and inheritance from bottom (derived classes) to top (base classes).

CBSE Benefits
• Reduced lead time. It is faster to build complete applications from a pool of existing components.
• Greater return on investment (ROI). Sometimes savings can be realized by purchasing components rather than redeveloping the same functionality in-house.
• Leveraged costs of developing components. Reusing components in multiple applications allows the costs to be spread over multiple projects.
• Enhanced quality. Components are reused and tested in many different applications.
• Maintenance of component-based applications. With careful engineering, it can be relatively easy to replace obsolete components with new or enhanced components.

CBSE Risks
• Component selection risks. It is difficult to predict component behavior for black-box components, or there may be poor mapping of user requirements to the component architectural design.
• Component integration risks. There is a lack of interoperability standards between components; this often requires the creation of “wrapper code” to interface components.
• Quality risks. Unknown design assumptions made for the components make testing more difficult, and this can affect system safety, performance, and reliability.
• Security risks. A system can be used in unintended ways, and system vulnerabilities can be caused by integrating components in untested combinations.
• System evolution risks. Updated components may be incompatible with user requirements or contain additional undocumented features.

Build vs Buy
          Build                   Buy
Pros      Complete integration    Opportunity cost
Cons      Opportunity cost        Dependent on external processes
                                  • Feature requests
                                  • What if they stop maintaining it?
Routine Maintenance and Support?
• Depends
NEVER BUILD SECURITY (OR RANDOMNESS)

Department of Computer Science
CS411 Software Architectures
@perrydBUCS

Architecture and system characteristics
▪ Performance - Localize critical operations and minimize communications. Use large rather than fine-grain components.
▪ Security - Use a layered architecture with critical assets in the inner layers.
▪ Safety - Localize safety-critical features in a small number of sub-systems.
▪ Availability - Include redundant components and mechanisms for fault tolerance.
▪ Maintainability - Use fine-grain, replaceable components.
(30/10/2014, Chapter 6 Architectural Design)

Data-Centered Architecture
What's the pain? How much money do you have?
[Diagram: thick clients around a central data store. Computes are client-side.]

Data-Centered Architectures
[Diagram: thin clients around a central Data / App node. Client is view: dumb terminal (green screens).]

Call and Return Architecture
[Diagram labels: Web server; abstraction layers; App server; XML/XSLT; Middleware; Fanout; three Data servers; three Databases. BEDS = back-end data sources.]

Client-Server Architecture
▪ Each component of a client-server system has the role of either client or server (request / response)
  - Client: a component that makes requests; clients are active initiators of transactions
  - Server: a component that satisfies requests; servers are passive and react to client requests

Thick vs Thin client
▪ Thick client: Work is done on both client and server
▪ Thin client: Work is done almost exclusively on the server

Tiered Web Architectures
▪ Web applications are usually implemented with 2-tier, 3-tier, or multitier (N-tier) architectures
▪ Each tier is a platform (client or server) with a unique responsibility

Multitier C-S Architecture
▪ A multitier (N-tier) architecture is an expansion of the 3-tier architecture, in one of several different possible ways
  - Replication of the function of a tier
  - Specialization of function within a tier
  - Portal services, focusing on handling incoming web traffic
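To make the client / server roles and the "each tier has a unique responsibility" idea concrete, here is a small in-process sketch of a thin client talking to an application tier that sits in front of a data tier. Everything in it is an assumption for illustration: the class names, the dictionary-shaped request and response, and the in-memory cart data stand in for real network calls and a real database.

```python
class DataTier:
    """Data server: owns storage and nothing else (hypothetical in-memory store)."""

    def __init__(self) -> None:
        self._rows = {"alice": 3, "bob": 0}  # user -> items in cart

    def cart_size(self, user: str) -> int:
        return self._rows.get(user, 0)


class AppTier:
    """Application server: passive, reacts to requests, holds the business logic."""

    def __init__(self, data: DataTier) -> None:
        self._data = data

    def handle(self, request: dict) -> dict:
        user = request.get("user", "")
        count = self._data.cart_size(user)
        return {"status": 200, "body": f"{user} has {count} item(s) in the cart"}


class ThinClient:
    """Client: the active initiator; it only sends requests and displays responses."""

    def __init__(self, server: AppTier) -> None:
        self._server = server

    def show_cart(self, user: str) -> str:
        response = self._server.handle({"user": user})
        return response["body"]


client = ThinClient(AppTier(DataTier()))
print(client.show_cart("alice"))  # alice has 3 item(s) in the cart
```

A thick client would move some of the work in `AppTier.handle` (for example, formatting the message) into the client itself, so that work is done on both sides.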
Replication
▪ Application and data servers are replicated (mirror or shard)
▪ Servers share the total workload

Specialization
▪ Servers are specialized
▪ Each server handles a designated part of the workload, by function

Microservices Architecture
▪ Blends replication with specialization via load balancing
[Diagram: Gateway, Load Balancer, and five Shopping Cart Service instances]

Core Concepts
• Requirements → Business Problems
• Layers of Abstraction
• Encapsulation of concerns
• Docs / logging / testing
• Proactive communication
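Looking back at the Replication, Specialization, and Microservices slides, the difference is easier to see in code. The sketch below assumes a round-robin load balancer and a gateway that routes by name. The slides only show a Shopping Cart Service, so the "catalog" service, the route names, and every class and function name here are hypothetical.

```python
from itertools import cycle
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def make_replica(service: str, replica_id: int) -> Handler:
    """Build one replica of a service; all replicas of a service behave identically."""
    def handle(payload: str) -> str:
        return f"{service}#{replica_id} handled {payload}"
    return handle


class LoadBalancer:
    """Replication: identical replicas share the workload (round-robin here)."""

    def __init__(self, replicas: List[Handler]) -> None:
        self._next = cycle(replicas)

    def dispatch(self, payload: str) -> str:
        return next(self._next)(payload)


class Gateway:
    """Specialization: each route goes to the service that owns that function."""

    def __init__(self, routes: Dict[str, LoadBalancer]) -> None:
        self._routes = routes

    def request(self, route: str, payload: str) -> str:
        if route not in self._routes:
            raise KeyError(f"no service registered for {route!r}")
        return self._routes[route].dispatch(payload)


gateway = Gateway({
    "cart": LoadBalancer([make_replica("cart", i) for i in range(3)]),
    "catalog": LoadBalancer([make_replica("catalog", i) for i in range(2)]),
})
for item in ("add:book", "add:pen", "add:ink", "add:cup"):
    print(gateway.request("cart", item))
```

With three cart replicas, the fourth request wraps back around to replica 0. Real load balancers usually also weigh health and load, which this sketch ignores.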