A programming language is a formal language in which the instructions that a computer must carry out are written. Such languages have a different syntax and grammar from natural languages, which are too complex and ambiguous to serve as programming languages: code written in a programming language should be interpretable by the computer in only one way.
Thousands of programming languages have emerged over the years, and they can be categorized in several ways. A widely used classification is by programming paradigm. Important examples are the imperative, functional and logic programming paradigms; note that many programming languages combine multiple paradigms.
There are several ways in which a computer program written by a software developer can be executed by a computer. The code that the developer sees and edits is called the source code of the software; it must somehow be converted into the machine language of the computer in question that can be executed by the processor. Broadly speaking, there are the following options:
- No conversion: the programmer enters machine code directly into the computer’s memory. This is so impractical and time-consuming that it hardly ever happens.
- Interpretation: an interpreter reads the source code piece by piece, translates each piece into meaningful instructions, and executes them immediately, supported by a runtime library. A program that works like this is often called a script, and the programming language in question a scripting language.
- Assembly: an assembly language makes it possible to write programs as sequences of instructions and data that map directly onto machine language, but in a somewhat more symbolic way: machine instructions have names, memory addresses can be given names, and macros can be used. An assembler converts such code into object code files, which a linker combines with pre-existing object code from software libraries into an executable program.
- Compilation: translation of the source code to another language (the target language) by a compiler. The target language can be assembly language; or a machine-independent intermediate language (bytecode, also known as P-code) specially designed for the translation process, which must then be compiled or interpreted; or any other programming language.
There are all kinds of intermediate forms and variants.
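The difference between interpretation and compilation to an intermediate language can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the tuple-based expression format, the `PUSH` and `APPLY` instruction names and all function names are made up for this example and do not correspond to any real interpreter or compiler.

```python
# Expression trees: either a number, or a tuple (operator, left, right).
# Both this tree format and the stack "bytecode" below are hypothetical.
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def interpret(node):
    """Interpretation: walk the source structure and execute it immediately."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    return OPS[op](interpret(left), interpret(right))


def compile_to_bytecode(node):
    """Compilation: translate the tree into a flat list of stack instructions."""
    if isinstance(node, (int, float)):
        return [("PUSH", node)]
    op, left, right = node
    return compile_to_bytecode(left) + compile_to_bytecode(right) + [("APPLY", op)]


def run_bytecode(code):
    """A minimal virtual machine that executes the intermediate code."""
    stack = []
    for instruction, argument in code:
        if instruction == "PUSH":
            stack.append(argument)
        else:  # "APPLY": pop two operands and push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[argument](left, right))
    return stack.pop()


expr = ("+", 2, ("*", 3, 4))          # represents 2 + 3 * 4
bytecode = compile_to_bytecode(expr)  # translated once, can be run many times
```

Here `interpret(expr)` and `run_bytecode(bytecode)` produce the same value, but the second route separates translation from execution, which is what allows the translated form to be stored, optimized or compiled further.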
A high-level (i.e. compiled or interpreted) programming language is designed to let the programmer specify as clearly and elegantly as possible what a program should do, in terms of how the programmer thinks about the problem and without detailed knowledge of exactly how it will be executed by the computer: such languages provide high-level abstractions and are machine independent. Assembly language programming is done only when specific knowledge about the precise operation of the computer in question is to be exploited, for example because the program would otherwise use too much space or time.
When compilation is involved, a distinction is often required between actions performed during the editing of the source code (‘at edit time’), during the translation process from source code to target code (‘at compile time’), and during the execution of the target code (‘at run time’). At each of these stages, software can support the programmer, for example, by checking the validity or meaningfulness of certain operations or expressions.
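The distinction between compile time and run time can be observed with Python's built-in `compile` and `eval` functions, which expose the translation and execution steps separately. The helper name `classify` is an assumption made for this sketch, and the example only handles the two error types shown.

```python
def classify(source):
    """Report at which stage, if any, a Python expression fails."""
    try:
        # Translation step: source text is turned into a code object.
        code = compile(source, "<example>", "eval")
    except SyntaxError:
        return "rejected at compile time"
    try:
        # Execution step: the translated code is actually run.
        eval(code)
    except ZeroDivisionError:
        return "failed at run time"
    return "ok"
```

An expression such as `"1 +"` is malformed and is rejected during translation, whereas `"1 / 0"` is syntactically valid and only fails once it is executed.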
Code optimization often takes place during translation. A simple example: if during translation it appears that an addition or subtraction with 0 occurs in the target code, it can be omitted.
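The optimization just described can be sketched as a small rewriting pass over an expression tree. The tree format (tuples of operator, left operand, right operand, with strings as variable names) and the function names are assumptions chosen for illustration; real compilers apply such rules to their own internal representations.

```python
def is_zero(value):
    """True only for the integer constant 0 (not for variables or subtrees)."""
    return isinstance(value, int) and value == 0


def simplify(node):
    """Remove additions and subtractions of the constant 0 from an expression tree."""
    if not isinstance(node, tuple):
        return node                      # a number or a variable name
    op, left, right = node
    left, right = simplify(left), simplify(right)
    if op in ("+", "-") and is_zero(right):
        return left                      # x + 0 -> x   and   x - 0 -> x
    if op == "+" and is_zero(left):
        return right                     # 0 + x -> x
    return (op, left, right)             # note: 0 - x must NOT become x
```

For example, `simplify(("*", ("+", "x", 0), ("-", "y", 0)))` yields `("*", "x", "y")`, so the target code no longer needs the two redundant operations.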
A program that has been translated into target code by a compiler can generally be executed faster, partly thanks to optimization, than the same program run by an interpreter, because the interpreter has to analyze and translate each command while the program is running: the equivalent of compilation is done at run time. Many languages, however, use an intermediate approach in which compilation to intermediate code and/or target code still takes place when the program is launched or while it runs: Just-In-Time (JIT) compilation.
The traditional distinction between compiled languages on the one hand and interpreted languages (or ‘scripting languages’) on the other is therefore not entirely accurate. If an interpreted programming language is popular, compilers (JIT or otherwise) are often written to speed up execution; it also happens that an interpreter is written for a language that was previously only compiled, or a translator from one programming language to another.
In the commonly used definition, programming languages are languages that are Turing complete. That is, it must be possible to write an interpreter for a Turing machine in the programming language, and it must be possible to write an interpreter for the programming language on a Turing machine.
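The first half of this definition can be illustrated with a short Turing machine interpreter written in Python. This is a minimal sketch, not a formal proof: the rule format, the state names `start` and `halt`, the blank symbol `_` and the step limit are all assumptions made for the example (a real Turing machine has an unbounded tape and no step limit).

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is -1 (left) or +1 (right). The machine stops in state "halt".
    """
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:      # safety limit, purely for this sketch
            raise RuntimeError("step limit reached before the machine halted")
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += move
        steps += 1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical example machine: invert every bit, halt at the first blank.
INVERT_BITS = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", +1),
}
```

Calling `run_turing_machine(INVERT_BITS, "1011")` returns `"0100"`. Any machine that can be written as such a rule table can be run by this function, which is the sense in which Python can interpret Turing machines.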
In a language that is not Turing complete, a smaller class of problems can be solved than in a Turing-complete language. For example, in SQL without recursive queries you can calculate totals over tables of data, but you cannot calculate the shortest route between two points in an arbitrary graph.
It is possible to program computers directly in their own machine language, by specifying the ones and zeros that the processor understands. This was common for the first computers, on which groups of bits were set with switches. However, it was quickly discovered that programs written that way were far too difficult to maintain. A symbolic notation was therefore soon devised to represent the machine instructions as text in the form of mnemonics, which made the instructions easier to read. Such code, which largely corresponds one-to-one to the machine instructions, is called assembly code (or, loosely, assembler) and is written in assembly language. A program that converts this code into machine language is called an assembler.
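The one-to-one relation between mnemonics and machine instructions can be shown with a toy assembler. The instruction set below is entirely hypothetical (an invented 8-bit format with a 4-bit opcode and a 4-bit operand) and does not describe any real processor; the function name `assemble` is likewise an assumption.

```python
# Hypothetical 8-bit instruction format, invented for this example:
# the upper 4 bits hold the opcode, the lower 4 bits hold an operand.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011, "HALT": 0b1111}


def assemble(source):
    """Translate mnemonic lines such as 'ADD 3' into a list of machine-code bytes."""
    program = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()      # drop comments and whitespace
        if not line:
            continue                           # skip empty lines
        mnemonic, *operands = line.split()
        operand = int(operands[0]) if operands else 0
        if not 0 <= operand <= 15:
            raise ValueError(f"operand out of range: {operand}")
        program.append((OPCODES[mnemonic.upper()] << 4) | operand)
    return program


source = """
LOAD 2     ; load the value at address 2
ADD 3      ; add the value at address 3
STORE 4    ; store the result at address 4
HALT
"""
machine_code = assemble(source)
bit_patterns = [format(byte, "08b") for byte in machine_code]
```

Each source line becomes exactly one byte; `bit_patterns` holds the ones and zeros that would otherwise have to be entered by hand.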
Programming in assembly or machine language requires the programmer to know a great deal about the computer being programmed. To make programming easier, other languages, the so-called high-level programming languages, were subsequently developed. The higher the level, the further the language is removed from the machine instructions. For example, an imperative programming language (such as Pascal or C) is closer to machine instructions than a functional programming language (such as Scheme or Haskell). A functional programming language is more in line with human thinking than with the internal workings of the computer: in Haskell, for example, functions can be written much like ‘normal’ mathematical definitions.
Programming languages are also divided into generations:
- First generation: machine language.
- Second generation: assembly language (the bare machine instructions, but written in a legible form).
- Third generation: procedural languages such as COBOL, Algol, Pascal, C and Fortran, and later also object-oriented languages such as C++ and Java.
- Fourth generation: languages with a higher level of abstraction, developed for a specific purpose, such as SQL and Progress 4GL.
- Fifth generation: problem-solving languages. The programmer does not specify an algorithm, but the problem itself, together with a number of associated constraints. Fifth-generation languages are mainly used in the field of artificial intelligence (AI). The best-known example is Prolog.
The generations are often abbreviated as GL, for example 3GL for 3rd Generation Language(s).
One of the first high-level programming languages was Plankalkül, designed by the German engineer Konrad Zuse between 1942 and 1945.
The development of programming languages has clear parallels with that of natural languages:
- Thousands exist; new languages are constantly being added, while others become obsolete. A programming language can become unusable (if no compiler or interpreter exists that runs on a working computer) and can even be lost (if knowledge of the language itself disappears).
- Their popularity varies widely: some languages have only ever been used by their own creator, while others are used by millions every day.
- Languages in use usually evolve, with extensions and changes being made in successive versions (for example, VB.NET no longer resembles the BASIC from which it was developed). This development does not have to be linear: one language can be based on another or largely incorporate elements from other languages; a language can also split into different versions, each of which develops further.
To describe programming languages, a meta-language has been devised: BNF or Backus-Naur Form. It only describes the form (syntax) of programs written in the language, not their meaning (semantics).
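As an illustration, the comment at the top of the following Python sketch gives a BNF description of a very small, invented expression language, and the functions below it check whether a string conforms to that description. The grammar and all function names are assumptions made for this example. In line with the remark above, the checker only decides whether a text has the right form; it says nothing about what the expression means or evaluates to.

```python
# A tiny, hypothetical grammar in BNF:
#
#   <expr>  ::= <term> | <term> "+" <expr>
#   <term>  ::= <digit> | "(" <expr> ")"
#   <digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"


def parse_expr(text, pos):
    """Match <expr> starting at pos; return the position after it, or None."""
    pos = parse_term(text, pos)
    if pos is not None and pos < len(text) and text[pos] == "+":
        return parse_expr(text, pos + 1)   # second alternative: <term> "+" <expr>
    return pos                             # first alternative: <term>


def parse_term(text, pos):
    """Match <term> starting at pos; return the position after it, or None."""
    if pos is None or pos >= len(text):
        return None
    if text[pos] in "0123456789":          # <digit>
        return pos + 1
    if text[pos] == "(":                   # "(" <expr> ")"
        pos = parse_expr(text, pos + 1)
        if pos is not None and pos < len(text) and text[pos] == ")":
            return pos + 1
    return None


def matches_expr(text):
    """Return True if the whole text is a valid <expr> according to the grammar."""
    return parse_expr(text, 0) == len(text)
```

With these definitions, `matches_expr("1+(2+3)")` is true, while `matches_expr("1+")` and `matches_expr("(1")` are false because they cannot be derived from the grammar rules.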