Revision as of 16:56, 22 December 2016 by Henk.muller (no edit summary) ⟶ Revision as of 17:23, 22 December 2016 by Henk.muller (edit summary: More XS2 additions)
Line 44:
The architecture encodes instructions compactly, using 16 bits for frequently used instructions (with up to three operands) and 32 bits for less frequently used instructions (with up to six operands). Almost all instructions execute in a single cycle, and the architecture is event-driven in order to decouple the timings that a program needs to make from the execution speed of the program. A program will normally perform its computations and then wait for an [[Event (computing)|event]] (e.g. a [[Message passing|message]], time, or external I/O event) before continuing.

==Versions and Devices==
There are two versions of the xCORE architecture: XS1 and XS2. XS2 extends the XS1 architecture.

Line 51:
The XS1 architecture was the first xCORE architecture, defined in 2007.

~~The XS1 instruction set~~ It is implemented by the [[XCore XS1-G4]], [[XCore XS1-L1]], [[XCore XS1-SU]], and [[XCore XS1-AnA]]. The first is a four-core processing node; the other three are single- and dual-core processing nodes.

===XS2===
The XS2 ~~instruction set~~ architecture was defined in 2015. It is implemented by the [[xCORE-200]] series processors, which are marketed as the XL2 series (general purpose), XU2 series (USB), XE2 series (RGMII), and versions with embedded flash.

~~The~~ XS2 ~~architecture was defined in 2015, and~~ extends the XS1 architecture with a limited form of [[Dual Issue]] execution.<ref name='xs2'/> The processor core comprises two lanes. The ''Resource lane'' can execute I/O operations and general arithmetic. The ''Memory lane'' can execute memory operations, branches, and general arithmetic. Short resource or arithmetic instructions can be executed in the resource lane; short memory, branch, or arithmetic operations can be executed in the memory lane. Long instructions span both lanes. In dual-issue mode, all pairs of instructions are aligned on a 32-bit boundary.

~~In dual issue mode all pairs of instructions are aligned on a 32-bit boundary.~~

A few instructions have been added to aid in high-bandwidth processing, such as dual-word load/store, dual-word zip and unzip (bit and byte strings), and dual-word arithmetic saturation and shift.

==Architecture==

Line 177 ⟶ 180:
single instruction; inter-module calling requires at most two instructions. It is up to the callee to save the link register if it is not a leaf function; a single instruction extends the stack and saves the link register.

Dual-issue mode, available on XS2, enables one short load, store, or branch instruction to be paired with one short resource instruction. Short arithmetic instructions can be paired with any instruction. This enables inner loops that, for example, transfer data from memory to I/O to be halved in length by issuing the LOAD instruction together with the ADD instruction, and the change to the counter together with the branch instruction.

===Parallel programming model===

Line 188 ⟶ 193:
A thread can, with a single instruction, synchronise with a group of threads using a barrier synchronisation. Alternatively, a thread can synchronise using a lock, providing mutual exclusion.
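The difference between the two synchronisation styles can be illustrated with a minimal host-side sketch. It uses Python's standard `threading` module as an analogy only: it is not xCORE code, and it does not model the single-instruction barrier or the synchronizer and lock resources of the architecture. The function name `run_demo`, the thread count, and the data layout are hypothetical, chosen purely for illustration.

```python
import threading

NUM_THREADS = 4  # hypothetical thread count, chosen for illustration


def run_demo(num_threads=NUM_THREADS):
    """Host-side analogy of barrier and lock synchronisation (not xCORE code)."""
    barrier = threading.Barrier(num_threads)  # every thread must arrive before any continues
    lock = threading.Lock()                   # mutual exclusion around the shared total
    shared = {"total": 0}
    partial = [0] * num_threads               # one slot per thread
    seen_after_barrier = []

    def worker(idx):
        # Phase 1: each thread writes only its own slot, so no lock is needed.
        partial[idx] = idx + 1
        # Barrier: wait until all threads have completed phase 1.
        barrier.wait()
        # Phase 2: every slot has been written, so reading them all is safe.
        local_sum = sum(partial)
        # Lock: only one thread at a time may update the shared state.
        with lock:
            shared["total"] += local_sum
            seen_after_barrier.append(local_sum)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"], seen_after_barrier
```

The barrier separates a write phase from a read phase for the whole group, whereas the lock serialises individual updates to one shared value. Without the barrier, a thread could read slots that other threads have not yet written.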
In order to communicate data when using barriers and locks, threads can either write data into the registers of another thread, or they can access memory of another thread (provided both threads execute on the same core). If shared memory is used, then the compiler or the programmer must ensure that there are no race conditions.

The XS2 architecture has a 'priority mode' that enables threads to run at high priority. Low-priority threads are guaranteed progress, but high-priority threads are guaranteed a thread cycle when they are ready to execute.

===I/O and timing instructions===

Line 193 ⟶ 200:
Common resources that are supported are ports (for external input and output), timers (that allow timing to a reference clock), channels (that allow communication and synchronization between threads within a core, and threads on different cores), locks (which allow controlled access to shared memory), and synchronizers (which implement barrier synchronizations between threads).

~~==Devices==~~

~~The XS1 instruction set is implemented by the [[XCore XS1-G4]], [[XCore XS1-L1]], [[XCore XS1-SU]], and [[XCore XS1-AnA]]. The former is a four-core processing node, the latter three are single and dual core processing nodes.~~

~~The XS2 instruction set is implemented by the [[xCORE-200]] series processors, which is marketed as the XL2 series (general purpose), XU2 series (USB), XE2 series (RGMII), and versions with embedded flash.~~

== References ==

XCore Architecture: Difference between revisions