.TL
The solution to bad code
.AU
Lucas Standen
.AI
7949
.AB

.NH 1
Reading this document
.LP
This document is written in roff and can be found online at:

https://github.com/standenboy/school/tree/master/comp/lucas-standen-NEA/writeup

It uses the ms macro package of troff. It can be compiled using the Makefile,
or make.sh. A table of contents has been generated using pdftocgen and is
embedded into the pdf; most pdf readers have a button to open it (firefox has
it in the top left, in zathura press tab to view it).

A note on the formatting of the roff: the text is limited to 80 characters per
line and is written in plain ascii, with no utf8 emojis and the like. Code
snippets are left in plain text, while full files are converted to a ps file
via https://carbon.now.sh/ and should be 150mm ^ 2 (as ps is a vector format
this won't lower quality, though you might need to zoom in). They then have
their source linked above them, assuming they are from a file and not a small
example.

.NH 1
Analysis
.NH 2
The current problem
.LP
For general small and simple projects, I write in C. However this leads to
hours of debugging due to segfaults and memory leaks. Due to the language's
manual memory management the programmer is required to know a great deal about
the hardware they write for, and the second anything goes wrong, it is vague
on how to fix things.

.B "I need a language that stops me from shooting myself in the foot"

C has been standard for many decades now and its age is showing; it lacks
many modern features like OOP, or higher level functional abstractions, that
have become common in recent years due to their helpfulness. This is not to
fault C's achievements either; the language is my personal choice for most
projects for a reason: it's fast and powerful, and any solution I make should
not cut that away.

.NH 2
A solution
.LP
.BI "Zippy LANG"

A next generation language, for general use. Designed for keeping code simple,
neat and readable. It will be similar to functional languages, known for their
strict ability to keep code safe and practical. The language should be
interpreted like python, perl and lisp, to allow for easy debugging tools.

The goal of Zippy is to make coding easier, while remaining fast, with an
interpreter written in C.

.NH 2
What is a programming language
.NH 3
A very simple explanation
.LP
At its lowest definition a PL is a set of specific words that, when given to a
computer in the right order, have a reproducible behaviour. A more human way
of saying that would be that it's how we control computers.
.NH 3
Why are there so many
.LP
When someone is looking at code it can often be seen as just that; however
there are hundreds of languages that all take the idea of "code" in very
different ways. Some are designed for specific hardware, some are designed for
making general use programs, while others are highly specialized. It is
important to see "code" as more than just one overarching term, and instead
see where the code is being used and evaluate it from that.


.NH 2
Researching, and getting a scope of the project
.LP
Before I start to design a language I should first find examples of others
and decide what I want my language to be like.

I'd like my language to feel modern so I should take inspiration from what
other modern languages do; however on the backend I want my language to be
stable and fast, and for that I should look at older projects.

.NH 3
Examples of older similar projects, that are a good base for my language
.NH 4
Python
.LP
Python is a high level OOP language that was designed in 1991.
It was made to make programming easy while still being able to use some of C's
functions. Although it has become standard for many use cases, it is slow,
inefficient and very bloated.

https://www.python.org/

Zippy should take python's high level abstractions, as they make programming
very easy, and it should try to take notes from its libraries as they are
mostly well written and well documented.
.NH 4
Lisp
.LP
Lisp is one of the oldest high level programming languages, developed at MIT.
It is the first functional language, pioneering many common features like
higher order functions, recursion and garbage collection. It is generally not
used any more as it feels old compared to other functional languages, like
OCaml or Haskell.

https://lisp-lang.org/

Zippy should try to take a lot from the syntax of lisp; the brackets make it
easy to see what parts of code will affect what, and make things easy to
parse.
.NH 4
Perl
.LP
Perl is a scripting language designed for use on unix-like systems such as
linux, when bash is too slow or not suited for the job. Perl is often
described as the glue of the universe (see xkcd https://3d.xkcd.com/224/).
Its syntax is quite strange however, and it is slow, making it poorly suited
towards general use.

https://www.perl.org/

Zippy should take from perl's minimalism; it is a small language that is of
a similar size to bash or zsh, while feeling closer to python. If Zippy can
achieve a similar small size while remaining powerful, I will be happy.

.NH 3
Examples of new similar projects that are also a good base
.NH 4
Gleam
.LP
Gleam is a modern language released in the past 5 years. It is highly
functional, with no mutable data and no traditional loops. Instead recursion
can be used to replace a lot of these features.
Gleam compiles to erlang/BEAM bytecode, much like java to the jvm, and doing
this has made Gleam a highly scalable language with good library support out
of the box.

https://gleam.run/

Zippy should take from the functional elements of Gleam, as they keep programs
safer; however Zippy should not remove all procedural elements, as for loops
are very helpful.
.NH 4
Haskell
.LP
Haskell is another modern functional language known for being very
complicated, however incredibly powerful. Its syntax feels very mathematical
and incredibly terse.

https://www.haskell.org/

Perhaps Zippy could learn from Haskell, as it provides functional and
procedural elements, making it a well rounded language.
.NH 4
Hare
.LP
Hare was designed to be a 100 year language, and thus stability is its main
goal; it is not set to have a syntax change any time soon, and it has a strong
emphasis on memory safety. It fits into the same part of the tech stack as C,
and thus it can be used for some very low level work.

https://harelang.org/

I think Zippy should have a strong emphasis on stability, much like Hare; too
many times have I segfaulted due to a tiny mistake. Zippy should also look to
Hare's small size; you can fit a copy of Hare on a

.B "SINGLE 3 1/2'' FLOPPY"

.LP
This is something I too should try to achieve.

.NH 3
What should be taken away from these languages?
.LP
I was already leaning towards functional programming when I started this
project, however now I believe it's the only option for producing safe
applications. Zippy will be a functional language with a strong emphasis on
recursion.

I also believe that I should take the size of the interpreter into account,
as this is important for keeping the project manageable and consistent.

And finally I think that the syntax should be inspired by Lisp. Although Lisp
itself can be a messy language, with the right changes I am confident that I
can make an attractive language for the 21st century.

.NH 2
Clients
.LP
In a project of this nature, the client is every programmer alive, which is
a pretty large scope. To narrow this down as much as possible, I will
interview a small handful of people of different skill levels throughout the
project.

.NH 3
Client 1, Amy C
.LP
My first client is a friend of mine, Amy C. She is a confident programmer
who has completed many complicated projects. I am choosing her as a client as
she can give me technical feedback on my project and its function/utility.
.NH 3
Client 2, Rayn M
.LP
Another friend of mine, Rayn M, is a technical computer user, however he
does not know how to program at a high level. He will be a good client as he
can show me how my language looks to someone who doesn't understand the inner
workings, helping me design the structure of the code.
.NH 3
Client 3, a normie
.LP
some stuff about how the normie finds the completed project.
.NH 3
Client 4, myself
.LP
I've wanted to take on a project like this for a long, long time, and this
is the perfect opportunity to do so. I will be assessing myself along the way,
building the project to my personal specification.

.NH 2
Questionnaires
.LP
It is important to get feedback from end users, so I will run multiple
questionnaires throughout the project. I will then use them to slightly edit
the requirements of my project; this should make the final outcome more
helpful and closer to what people want.

In the section below you will find questionnaires from the analysis stage
of my project.
.NH 3
Questionnaire 1 for Amy C

.BI "[30th April 2024]"
.BI "answered by Amy, see pull request she left"
.NH 4
What do you find the most important in a language? (eg: speed, readability)
.LP
Speed, readability, debugging ease and disk space efficiency.
.NH 4
What tools are important for a language to have? (eg: pkg-manager, IDE
integration)
.LP
IDE integration (things like tab complete and debugging tools), a package
manager, and the ability to interact with the user through the command line
easily.
.NH 4
What features do you like from other languages (eg: C's advanced memory
management, haskell's terse syntax)
.LP
The ability to pass the memory reference of an object or function and a
collection of built-in or standard functions like "print", "split", or "sort".
.NH 4
What do you want to program in this language (eg: websites, low level systems)
.LP
Lightweight command line tools and web back ends.
.NH 4
Do you intend to use graphics in the programs you write?
.LP
No.
.NH 4
Would you prefer a language that focuses on ease of use, or power of the code?
.LP
I like a good balance between the two.
.NH 4
What were your last 3 projects? (could they have been written in Zippy?)
.LP
A website, a small command-line tool and a midi keyboard (program runs on
a Raspberry Pi Pico).
.NH 4
How many languages would you use on a single project? (could Zippy be used
in your codebase?)
.LP
I try to use as little languages in a project as possible, so likely not in
an existing project.
.NH 4
Do you care for low level control, or would you prefer high level abstractions?
.LP
I think low-level control is very important, but high-level abstractions
are convenient, so a good balance between the two is best.
.NH 4
Would you be happy to develop libraries for things that aren't already
implemented (eg: an SQL library)
.LP
Potentially if it is simple enough to implement new things.

.NH 3
Notes from questionnaire 1
.LP
The key things that I'm taking away from this first questionnaire are my
client's initial needs and use cases. I think it's clear my language can be of
assistance to my client; Zippy will be a good language for web back ends and
small command line tools, which my client expressed interest in.

I find the fact my client is worried about executable size interesting,
however I doubt it will be an issue; a ballooning code-base is unlikely as
only one person is writing the project.

I am also taking on the fact that my client wants good command line tools,
so a pkg-manager and bundler should be a priority; perhaps they could be
written in Zippy after the interpreter is done.

.NH 2
The first elements of the project
.LP
At this stage I can say that I'm confident in my project and its scope. I
have a goal in mind for it.

.B "The key things to take away from this section are:"

.B ----
Make a high level language with a usable set of features, to replace C in
many situations.

.B ----
Keep the language readable and easy, with powerful tools available.

.B ----
Ensure the language is well supported with tools like a pkg-manager.

.NH 2
Modelling
.LP
In larger projects, when a programmer needs a data structure that the language
they are writing in doesn't provide, they will need to make their own.

Below are a few examples of these data structures that C doesn't already
provide.
.NH 3
Linked lists
.LP
This is an alternative implementation of a list, where you store some data,
and the memory address of the next node.
Then you can move through the list by reading the data, then reading the data
of the next node, and repeating until the 'next' part of the node is empty.

A diagram showing this can be seen here:

.PSPIC linkedlist.ps

.LP
In C this is easy to implement as you can find a memory address very easily,
using '&' to find where a bit of data is stored. I will need to use a struct,
which is a bit like a class in C (however you can't attach a function to it).
A simple implementation looks like this:

typedef struct ll {
	void *data; // the data of the node
	struct ll *next; // the next node, or NULL at the end of the list

} ll;

.LP
The pros of a linked list are that data can be added to the start or end
easily, by changing the root node or the last node's next pointer.

Linked lists have a few downsides; for example you can't move through them
backwards, and unless you store the length separately, you can't find it in a
fast way.

In my project I would like to use linked lists in the AST (see later sections
for info), and to store lists in the language.
.NH 3
Dictionaries
.LP
A dictionary is a simple data structure that just stores a bit of data, and a
number or string to identify it.
A dictionary, like a linked list, can be implemented with a struct in C like
so:

typedef struct dict {
	void *data; // the stored value
	int id; // the number that identifies it

} dict;

.LP
In my project I think I could use a dictionary to represent a Zippy variable
and an ID that I can use to identify it; this could make execution faster as I
can compare IDs rather than string values.

.NH 2
Prototyping hard features
.NH 3
Abstract Syntax Trees (AST) theory
.LP
In a programming language many abstract data types will be used to allow
the code to execute, however I think the hardest part of this is an abstract
syntax tree.
This is a data structure that holds the code in an ordered form that can be
analysed and executed in a simple way. It is a tree structure, with the top
node being a root and all lower nodes being things needed to calculate the
root. It can be used not only for code but also for mathematical expressions.
I think the easiest way to show it is via a mathematical example.

Take the following expression for example:

.BX "(1 + (10 * (3 - (2 * 4))))"

We know that this is equal to -49.

However for a computer this is far harder to understand. This is because it
has no understanding of the order of operations.

To solve this we use an AST (abstract syntax tree).

When you solve that expression you know to start with (2 * 4), then 3 minus
the answer to that, and so on.

We can represent the steps as a tree like so:

.PSPIC ast.ps

.I "[Evaluates to 2 * (2 + 2)]"

As you can see, you need to evaluate the expression in the most brackets
first, then the next, and so on, working your way up.

You can evaluate code in a similar way, treating each operation (such as +-*/)
as a function, doing the most deeply nested function first, then working up.
Each expression can be represented in this tree; then to show a whole program
you can create a list of trees.

.NH 3
Implementing AST's
.LP
As a prototype I will make a program that can take mathematical expressions
and evaluate them, allowing for functions (in the form f(x)).
It will do this via AST's.

This prototype takes 173 lines of code. It takes a string as a cmd line
argument, then converts it into an abstract syntax tree, and finally it
executes it. This is just a simple prototype and thus it is small in scope.
It can only do simple operators (+-*/) and requires literal values to be
surrounded by [] so it knows it's not another expression to evaluate.

https://github.com/standenboy/school/tree/master/comp/lucas-standen-NEA/code/proto/ast

.PSPIC astg.ps

.LP
Above is the code for the AST. It stores an operation (which is just an
integer), and it stores a real left and real right value, alongside two other
nodes. The real values are integers; these would be the 2 numbers referenced
in the expression. The 2 nodes form a recursive data structure, much like
putting an object of a class inside the definition of that class itself. They
are used to store values that may still be expressions; for example in
(+ [1] (+ [1] [1])) the second part of this expression would be in the "right"
variable. When code is executed I can check if "left" or "right" are null, and
if they are I know that I am at the lowest expression, which is only literal
values. Then I can execute that node and work my way up the tree.


The exec function will execute the operation, unless there is a deeper node;
if there is a deeper node, then it executes it and places the result in the
right or left spot respectively.

Expressions are taken as input with the following code, and converted into
the AST:

https://github.com/standenboy/school/tree/master/comp/lucas-standen-NEA/code/proto/ast

.PSPIC ast.c.ps

Here is an example input and output:

    ./ast "(+ (- [3] [1]) (- [3] [1]))"

.BX 4

Note the [] used to tell the program where the literal values are.

Overall this was a relatively successful prototype; although it isn't fully
functional as a language, it has fit the design.
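.LP
To make the description of the node and the exec step concrete, here is a
minimal sketch in C. It is not the prototype's actual code (that is in the
linked file); the names ast_node, ast_op and ast_exec are assumptions made up
for this illustration, and division by zero is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation codes; the prototype only says "an integer". */
typedef enum { OP_ADD, OP_SUB, OP_MUL, OP_DIV } ast_op;

/* Hypothetical node layout following the description above: an operation,
 * two literal ("real") values, and two optional deeper nodes. */
typedef struct ast_node {
	ast_op op;
	int realleft;           /* literal left value, used when left is NULL */
	int realright;          /* literal right value, used when right is NULL */
	struct ast_node *left;  /* deeper expression, or NULL */
	struct ast_node *right; /* deeper expression, or NULL */
} ast_node;

/* Execute the deepest nodes first, place each result in the matching
 * "real" slot, then apply this node's own operation. */
int ast_exec(ast_node *n)
{
	assert(n != NULL);
	if (n->left != NULL)
		n->realleft = ast_exec(n->left);
	if (n->right != NULL)
		n->realright = ast_exec(n->right);

	switch (n->op) {
	case OP_ADD: return n->realleft + n->realright;
	case OP_SUB: return n->realleft - n->realright;
	case OP_MUL: return n->realleft * n->realright;
	case OP_DIV: return n->realleft / n->realright; /* no zero check */
	}
	return 0; /* not reached for valid operation codes */
}
```

Building the tree for (+ (- [3] [1]) (- [3] [1])) by hand means two leaf nodes
holding 3 and 1 with OP_SUB, and a root with OP_ADD pointing at both; calling
ast_exec on the root then gives 4, matching the example output above.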

The rest of the code is the process of converting the string input to literal
values and inserting them into the AST.

.NH 3
Feedback
.LP
My first client (Amy C) said that putting the numbers inside square brackets
was inconvenient and annoying, and that it would be better if the numbers were
separated by spaces instead of being written as separate square bracket
surrounded literals.

As this is a prototype I won't fix this issue, however in the actual language
this is a needed feature that I will be implementing.

.NH 3
Mixing linked lists and AST's
.LP
Mixing these 2 data structures together you can represent an entire program.
A linked list of AST's is how Zippy will represent all code the user writes.

Here is an example of this:

.PSPIC AST+LL.ps

.LP
In this example the linked list is represented by the numbers seen at the top,
and the AST's are the trees moving down.

As you can see, when a value is referenced that is from a different AST the
tree will link to another one. This will work the same for function calls,
however instead of linking to value definitions it will link to function
definitions.

.NH 2
Objectives
.NH 3
An interpreter for the Zippy language
.NH 4
Linked list of AST's
.LP
All of a loaded program should be represented as a linked list of individual
AST's. The developer should be able to access the AST for easy hacking.
Functions can be represented as a pointer to another part of the list.
.NH 4
A lisp like syntax
.LP
This is to ensure the language can be parsed quickly, and is easy to write.
.NH 4
Functional language
.LP
This language should lean into the functional programming paradigm, taking
inspiration from other functional languages such as lisp and gleam.
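.LP
As a rough illustration of the "linked list of AST's" objective above, the
sketch below shows one way a loaded program could be held in C. The names
prog_node, prog_append, prog_length and prog_free are hypothetical, and the
tree is kept as an opaque pointer because the real AST layout is not fixed
yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One entry per top level expression; "tree" would point at that
 * expression's AST root (opaque in this sketch). */
typedef struct prog_node {
	void *tree;
	struct prog_node *next;
} prog_node;

/* Append a tree to the end of the program.
 * Returns 0 on success, -1 if memory could not be allocated. */
int prog_append(prog_node **head, void *tree)
{
	prog_node *n;

	assert(head != NULL);
	n = malloc(sizeof *n);
	if (n == NULL)
		return -1;
	n->tree = tree;
	n->next = NULL;

	while (*head != NULL)       /* walk to the last "next" slot */
		head = &(*head)->next;
	*head = n;
	return 0;
}

/* Count the expressions in the program by walking the list. */
size_t prog_length(const prog_node *head)
{
	size_t len = 0;

	for (; head != NULL; head = head->next)
		len++;
	return len;
}

/* Free the list nodes; the trees themselves are owned elsewhere. */
void prog_free(prog_node *head)
{
	while (head != NULL) {
		prog_node *next = head->next;
		free(head);
		head = next;
	}
}
```

Walking the list from the head and executing each tree in turn would then run
the program in source order. Appending is linear in the list length here;
keeping a tail pointer would avoid that, at the cost of a slightly larger
structure.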
.NH 5
Recursion
.LP
Zippy must support recursive algorithms being implemented in it; this will
make the AST have nodes linking back to parent nodes in a linked list.
.NH 5
Higher order functions
.LP
Zippy must support the usage of higher order functions; this will mean the
AST needs to have an unlimited depth as otherwise the limit would be quickly
reached. It can't be hard-coded, it must be dynamic.
.NH 4
Performance
.LP
The interpreter must be fast. The language is designed to work as an
alternative to C, one of the fastest languages of all time, so speed is the
priority; memory footprint is not as much of a requirement.
.NH 4
Safe
.LP
Code that the user writes must be safe, and not prone to errors. This can
be handled via a strong syntax checker and type safety.

.NH 3
Standard library for Zippy
.NH 4
io
.LP
The language must have a simple to use I/O library to make outputs easy.
.NH 4
string
.LP
The language should have a string library that provides a string type, and
many algorithms that can be applied to strings (concatenation, insertion,
appending, splitting, stripping).
.NH 4
sorts
.LP
The language should have a sorting library that provides algorithms used
for sorting (like merge sort).
.NH 4
graphs
.LP
The language must have a graph library that allows for easy creation of and
working with graphs. It should provide many algorithms to help traverse these
graphs.

.NH 3
Tooling for the Zippy language
.NH 4
zpypkg
.LP
Zippy must provide a package manager that allows code to be shared between
multiple users easily.
It should sync projects via git and allow them to be stored on any git host
the user likes.
.NH 4
Syntax checker
.LP
Zippy shouldn't have a built in syntax checker; instead it should be something
that can be run independently of the interpreter. This means that a lot of the
checking that interpreted languages do can be done once by the developer
before shipping the app, as opposed to every time the program is run, which
brings down performance.
.NH 3
Integration with C, via a C API
.NH 4
C API
.LP
You should be able to execute a string of Zippy code in C using a library
that is linked with the interpreter. This could allow Zippy to be used as a
configuration language like Lua.

.NH 2
Desirable features
.LP
If time allows I'd like to add some of the following features to flesh out
the language:
.NH 3
Raylib support
.LP
Raylib is a powerful game programming library for C, however it has been
ported to most languages under the sun due to how simple it is. If I have
time, porting Raylib to Zippy would make the language far more usable, as it
could be used for graphics programming.

https://www.Raylib.com/

.NH 3
Vim integration.
.LP
Zippy should have integration with the Vim editor for syntax highlighting.
This can be done by generating a linked list of AST's, then colouring function
calls a specific colour, variables another, and so on.
.NH 3
LSP
.LP
An LSP (Language Server Protocol) server is used by code IDE's to auto
complete code for you, and I'd like one for Zippy, although I am unsure as to
how to tackle this. I believe a program called treesitter can be helpful for
this.
.NH 3
Networking sockets
.LP
If possible I'd also like to provide bindings for unix network sockets,
however this would be very difficult, as I would need to allow Zippy structs
to be directly converted to C structs when executing ELF symbols (parts of an
executable file).

.NH 1
Design
.NH 2
Language specification
.LP
Like any other programming language Zippy needs to have a defined syntax. As
mentioned in the objectives section of Analysis, I want the language to follow
a lisp like syntax.

I also believe higher order functions should be taken as standard, and many
core functions will use them.

.NH 3
Data types
.NH 4
Basic types
.LP
i32 - signed integer of size 32 bits

u32 - unsigned integer of size 32 bits

i64 - signed integer of size 64 bits

u64 - unsigned integer of size 64 bits

char - single ascii code

float - standard C float

.NH 4
Advanced types
.LP
function - a function that can be used

generic - should be avoided; it removes checks for data types when inputting
values to functions, which will cause many runtime errors, however when
absolutely needed it is useful.

.NH 4
Arrays
.LP
Arrays can be shown like so:

x:type[]

With x being the variable name, type being the type of variable, and []
showing it's an array.

All arrays are dynamic, represented by a linked list on the back end.
.NH 5
Strings
.LP
Strings, like in C, are arrays of chars.

.NH 3
Built in functions
.NH 4
defun
.LP
(defun a:type b:type returntype
	...
	...

)

Returns a function that takes a and b as arguments (of fixed types), and
returns a value of returntype.

.NH 4
let
.LP
(let x:type value)

Creates a constant x of type type, set to value.

.NH 4
set
.LP
(set x:type value)

Creates the variable x with the given value, or reassigns it if it already
exists.

.NH 4
if/elif/else
.LP
(if condition function)

(elif condition function)

(else function)


Executes the function provided if the condition is true.

Elif works the same, except it only runs if the previous if statement was
false.

Else executes only if all previous statements were false.

.NH 4
for
.LP
(for i (condition) function)

Runs the function while the condition is true, and increments i every time
the function is called.

.NH 4
while
.LP
(while condition function)

Runs the function if the condition is true, and keeps running it until the
condition is false.

.NH 4
symbol
.LP
(symbol a:type b:type c:type returntype name:char[] elf:char[])

Returns a function that takes arguments a, b and c (of fixed types), given the
name of the function and the file path of the elf it is loaded from.

.NH 4
Arithmetic operations
.LP
Simple operations

(+ a b) returns a + b

(- a b) returns a - b

(* a b) returns a * b

(/ a b) returns a / b

.NH 4
Comparison
.LP
All return true or false

(= a b) returns if a = b

(!= a b) returns if a != b

(> a b) returns if a > b

(< a b) returns if a < b

(=> a b) returns if a >= b

(=< a b) returns if a <= b

.NH 4
cast
.LP
(cast a:generic type:char[])

Returns a, cast to the data type named by type, which is a string.

.NH 4
typeof
.LP
(typeof a:generic)

Returns, as a string, the type of the variable a.

.NH 4
terminate
.LP
(terminate error:error)

Kills the program at the current point, frees all related memory, and prints
the error info stored in error.

.NH 4
return
.LP
(return a:type)

Must be used in defun. Returns "a" from the function; "a" must be of the
function's return type.
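.LP
One way the interpreter could recognise the built in functions described in
this section is a small lookup table that turns a name into an ID once, so
that later steps compare integers rather than strings. This is only a sketch;
the enum builtin_id, the table and builtin_lookup are hypothetical names, not
part of the design above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical IDs for the built in functions listed in this section. */
typedef enum {
	BI_NONE = 0, /* not a built in (e.g. a user defined function) */
	BI_DEFUN, BI_LET, BI_SET, BI_IF, BI_ELIF, BI_ELSE, BI_FOR,
	BI_WHILE, BI_SYMBOL, BI_CAST, BI_TYPEOF, BI_TERMINATE, BI_RETURN
} builtin_id;

/* Name to ID table; order does not matter for a linear search. */
static const struct {
	const char *name;
	builtin_id id;
} builtin_table[] = {
	{ "defun", BI_DEFUN },   { "let", BI_LET },
	{ "set", BI_SET },       { "if", BI_IF },
	{ "elif", BI_ELIF },     { "else", BI_ELSE },
	{ "for", BI_FOR },       { "while", BI_WHILE },
	{ "symbol", BI_SYMBOL }, { "cast", BI_CAST },
	{ "typeof", BI_TYPEOF }, { "terminate", BI_TERMINATE },
	{ "return", BI_RETURN },
};

/* Return the ID for a built in name, or BI_NONE if it is not one. */
builtin_id builtin_lookup(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof builtin_table / sizeof builtin_table[0]; i++)
		if (strcmp(builtin_table[i].name, name) == 0)
			return builtin_table[i].id;
	return BI_NONE;
}
```

The tokenizer would call builtin_lookup once per function name and store the
ID in the token; the execution step can then switch on the ID. With only a
dozen or so names a linear search is likely fine; a sorted table or hash would
only matter if the list grew much larger.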

.NH 3
List of keywords
.LP
defun

for

while

if

elif

else

terminate

return

symbol

set

let

.NH 2
Memory management
.LP
Memory will be allocated when a variable is initialized, and freed when the
program stops.
Although this isn't the fastest method, it is simple and has less runtime
overhead.

.NH 2
Questionnaire 2 for Rayn M
.NH 3
How do you find this layout of the language?
.LP
.I "(5-6 points)"

- I like the immutable nature of the language

- I like the simplicity

- I like the low level performance this will have

- I dislike the word terminate

- I like the procedural approach, with the function robustness

- I dislike the brackets!
.NH 3
Response
.LP
Although he does dislike some of my features, I believe them to be core parts
of the language so I will keep them. I will keep his points in mind though; I
don't want to discourage learning the language due to its abstract syntax.

However as per his request I will change the terminate keyword to the more
normal exit.

An updated keyword list is as follows:

defun

for

while

if

elif

else

exit

return

symbol

set

let

.NH 2
What language do you use to make a programming language
.LP
As mentioned before Zippy will be written in C, with some parts being written
in Zippy itself.
I will try to keep dependencies/libraries to a minimum to make the project
easier to manage.

.NH 3
What is C?
.LP
C was made by Dennis Ritchie in 1972 at AT&T's Bell Labs. It was designed
to make programming low level systems far easier than it had been before.
It was used to create the unix operating system, which would go on to inspire
most modern operating systems in some way (macos still has code descended from
early unix).

The language quickly caught on outside of Bell Labs after more widely
available releases of unix arrived, such as bsd 4.4, sun os and GNU. It was
found to be able to do all the things that you could do in ASM, however with
far less of a headache.

.NH 3
Why is C?
.LP
As mentioned C can do anything that ASM can do, meaning it is lightning fast
and can take advantage of direct memory access. This allows you to make very
fast lightweight executables that can rival the performance of handwritten ASM
(often beating it if you enable compiler optimisations). It is this that makes
C the perfect language for implementing programming languages, where speed is
key and all features need to be available.

.NH 3
How is C?
.LP
C is compiled to ASM. The main compilers available are clang, gcc and MSVC;
I will be using gcc as it is generally standard in linux environments.

Many build systems are available for C, the main ones being cmake and gnu
make. Both of them have the goal of putting the compiling process in one
command. Cmake is cross platform (sort of; windows doesn't work well but it
does work).

.NH 3
Libraries
.LP
The libraries I will use are the following:

C stdlib

C unistd

C errno

Unix device files

Zippy strings

Zippy graphs

Zippy sorts

Additional libraries (may not be implemented):

Raylib

C sockets + Zippy sockets

.NH 3
Modularization
.LP
To make the project more manageable I will split it into many C files; this is
to keep the code from becoming impossible to edit.

The file layout looks as follows:

PLACE HERE

As you can see, this is split up over around 40 files and 16 folders. Each
file should not go over ~500 lines of code. This is to keep everything as
easy to manage as possible.

This level of modularization is needed for the development of Zippy, as
without it the files will become a mess that can't be worked with.

All .c files will be compiled into .o files, then the .o files can be linked
with the final zpy.c to generate the final executable.

.NH 4
Build system
.LP
The entire project is being built with GNU Makefiles. Each folder that builds
something will have its own Makefile. This means the entire project can be
compiled with a single make in the root folder of the project.

Example of make:

make -j2

This will build everything specified by 'Makefile', running up to 2 jobs in
parallel.

The project should be built with gcc and ld. It should be built with the -O3
flag so that the program runs as fast as possible. -O3 tells the compiler to
apply aggressive optimizations.

When the project is finished, I will try compiling with clang and tcc,
to compare performance.

.NH 2
Time table
.LP
The first step is to tackle the interpreter, so the zpy.c file needs to be
finished. The tokenizer, execution, and libs folders also need to be
finished. After this point you should be able to execute Zippy code, but not
syntax check it or get error handling.

The next step is zpycheck, the syntax and error handler. This should be run
before code is shipped to the user. It can reuse a lot of code from the
tokenizer and execution steps.

Finally I need to make zpypkg. This should be easy, as most of it can be
written in Zippy and a few bits can be written in bash.
It should be a good test of how Zippy can be written.

If time allows, it is at this point that I will write a Raylib library and
a unix/C sockets library.

.NH 2
Flow through the system
.LP
The algorithm to run code is quite complex, however it can be boiled down
to a few simple steps:

.B "read the text file (strip line breaks and tabs)"
.LP
.B "create an empty linked list"
.LP
.B "get the first expression from the text file (it will be enclosed in brackets)"
.LP
.B "get the function call and its args into a token"
.LP
.B "if an argument is itself a function call, convert it into a token"
.LP
.B "set that token as the argument in the first token"
.LP
.B "append the root token to the linked list"
.LP
.B "repeat until the text file string is empty"
.LP
.B "allocate memory for the program and prepare the execution step"
.LP
.B "at the head of the linked list, traverse to the bottom of the token tree"
.LP
.B "execute the lowest token"
.LP
.B "repeat until all tokens including the root have been executed"
.LP
.B "move to the next node of the linked list"
.LP
.B "repeat until the linked list is empty"

.LP
Within each of these steps are many smaller steps. The hardest part will be
making the tokens, as this requires a lot of string manipulation. The
execution will be a recursive algorithm. All trees will be represented via
structs (see section on ASTs).

PUT SOME FLOW CHARTS HERE

.NH 1
Technical Solution
.NH 1
Testing
.NH 1
Evaluation
.AE