Epoch parallelism: difference between revisions
Sergey (talk | contribs) (New page: «Classic approach: each command make some job, next command must see real or presented result of previous Copy byte, copy word, add, etc. Parallel approach: each...»)
Sergey (talk | contribs)
Revision as of 01:45, 22 August 2023
Classic approach: each command does some work, and the next command must see the real or apparent result of the previous one. Examples: copy byte, copy word, add, etc.
Parallel approach: each task is described as a set of actions, such as moving a byte from location A to location B: it will be read, processed and saved. There may be many independent actions that the CPU can group together, such as moving many bytes as words. These actions must not depend on each other, so all of them can be processed in any order or even in parallel. If we need to wait for these actions to finish, we mark them as belonging to some epoch. When we start a new epoch, we specify which other epochs it depends on, so that it begins only after they have finished. Multiple relations are possible: one epoch may start many following epochs, one epoch may wait for several epochs, or one epoch may wait for any one of the specified epochs.
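The epoch relations described above can be sketched as a small software model. This is only an illustration under stated assumptions, not a hardware design: the names `Epoch`, `ready`, `run_epochs` and the `mode` values `"all"` and `"any"` are hypothetical, and random shuffling stands in for the freedom to process actions in any order.

```python
import random

class Epoch:
    """A group of mutually independent actions plus the epochs it waits for.
    (Hypothetical model; names are not from the original text.)"""
    def __init__(self, name, actions, wait_for=(), mode="all"):
        self.name = name
        self.actions = list(actions)
        self.wait_for = tuple(wait_for)
        self.mode = mode  # "all": wait for every listed epoch; "any": for at least one

def ready(epoch, finished):
    """An epoch may start once its dependencies are satisfied."""
    if not epoch.wait_for:
        return True
    check = all if epoch.mode == "all" else any
    return check(dep in finished for dep in epoch.wait_for)

def run_epochs(epochs, seed=None):
    """Run epochs in dependency order; actions inside one epoch run in random order."""
    rng = random.Random(seed)
    finished, order = set(), []
    pending = list(epochs)
    while pending:
        runnable = [e for e in pending if ready(e, finished)]
        if not runnable:
            raise RuntimeError("unsatisfiable epoch dependencies")
        epoch = rng.choice(runnable)
        actions = epoch.actions[:]
        rng.shuffle(actions)          # order inside an epoch must not matter
        for action in actions:
            action()
        finished.add(epoch.name)
        order.append(epoch.name)
        pending.remove(epoch)
    return order

# Usage: copy eight bytes in one epoch, then sum them in a dependent epoch.
src = bytearray(b"ABCDEFGH")
dst = bytearray(len(src))
total = []

def copy_byte(i):
    def action():
        dst[i] = src[i]
    return action

copy = Epoch("copy", [copy_byte(i) for i in range(len(src))])
checksum = Epoch("sum", [lambda: total.append(sum(dst))], wait_for=["copy"])
order = run_epochs([checksum, copy], seed=1)
```

Whatever order the byte copies run in, the "sum" epoch starts only after the "copy" epoch has finished, so it always sees the complete destination buffer.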
Essentially, classic commands are replaced with epochs. Each instruction is marked with the epoch it belongs to. Operations inside an epoch can be reordered and run in parallel; dependencies are expressed as relations between one epoch and another.
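Because operations inside one epoch are independent, an executor is free to regroup them. The following sketch shows one way that byte moves could be merged into word-sized moves. It is an assumption-laden illustration: `coalesce` and `execute` are hypothetical names, the word size of 8 bytes is arbitrary, source and destination are assumed not to overlap, and alignment is not modeled.

```python
import random

def coalesce(moves, word=8):
    """Merge independent (src_offset, dst_offset) byte moves into
    (src_offset, dst_offset, length) chunks of at most `word` bytes."""
    chunks = []
    for s, d in sorted(moves):        # submission order is irrelevant inside an epoch
        if chunks:
            cs, cd, n = chunks[-1]
            # Extend the last chunk if this byte continues it on both sides.
            if n < word and s == cs + n and d == cd + n:
                chunks[-1] = (cs, cd, n + 1)
                continue
        chunks.append((s, d, 1))
    return chunks

def execute(chunks, src, dst):
    """Perform each chunk as a single multi-byte move."""
    for s, d, n in chunks:
        dst[d:d + n] = src[s:s + n]

# Usage: 20 byte moves, submitted in scrambled order, become three wider moves.
src = bytearray(range(20))
dst = bytearray(20)
moves = [(i, i) for i in range(20)]
random.Random(0).shuffle(moves)
chunks = coalesce(moves)
execute(chunks, src, dst)
```

In this model the program only states which bytes must move within the epoch; choosing the width of each transfer is left to the executor, which is the role the text assigns to the CPU.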
Inspired by memcpy implementations and by the question of how such an optimization could be implemented in the CPU itself, rather than by manually replacing a byte loop with a word loop.