There are 3 types of addresses: virtual, physical, and bus. For DMA, a bus
address is used. However, on x86 physical and bus addresses are the same (on
other architectures this is not guaranteed). Anyway, this assumption is still
used by the xdma driver, which uses the physical address for DMA access. I
have ported it the same way. Now, we need to additionally provide bus
addresses in the kmem abstraction and use them in the NWL DMA implementation.

DMA Access Synchronization
==========================
- At the driver level, a few types of buffers are supported:
* SIMPLE - non-reusable buffers; the usage information can be used for cleanup
after crashed applications.
* EXCLUSIVE - reusable buffers which can be mmapped by a single application
only. There are two modes of these buffers:
+ Buffers in STANDARD mode are created for a single DMA operation; if
such a buffer is detected while trying to reuse, the last operation
has failed and a reset is needed.
+ Buffers in PERSISTENT mode are preserved between invocations of the
control application and cleaned up only after the PERSISTENT flag is
* SHARED - reusable buffers shared by multiple processes. Not really

KMEM_FLAG_HW - indicates that the buffer can be used by hardware; actually
this means that DMA will be enabled afterwards. The driver is not able to
check whether it really was enabled and will therefore block any attempt to
release the buffer until KMEM_FLAG_HW is passed to the kmem_free routine as
well. The latter should only be called with KMEM_FLAG_HW after the DMA engine
is stopped. Then, the driver can be released by kmem_free if the ref count
reaches 0.

KMEM_FLAG_EXCLUSIVE - prevents multiple processes from mmapping the buffer
simultaneously. This is used to prevent multiple processes from using the
same DMA engine at the same time. When passed to kmem_free, it allows cleaning
buffers with lost clients even for shared buffers.

KMEM_FLAG_REUSE - requires reuse of an existing buffer. If a reusable buffer
is found (non-reusable buffers, i.e. those allocated without KMEM_FLAG_REUSE,
are ignored), it is returned instead of allocating a new one. Three types of
usage counters are used. At the moment of allocation, the HW reference is set
if necessary. The usage counter is increased by the kmem_alloc function and
decreased by kmem_free. Finally, the reference is obtained and returned
during mmap/munmap. So, on kmem_free, we do not clean
a) buffers with a reference count above zero or the hardware reference set.
The REUSE flag should be supplied, otherwise an error is returned
b) PERSISTENT buffers. The REUSE flag should be supplied, otherwise the
c) non-exclusive buffers with a usage counter above zero (for an exclusive
buffer, a usage counter above zero just means that the application
has failed without cleaning its buffers first. There is no easy way to
detect that for shared buffers, so it is left as a manual operation in
d) any buffer if KMEM_FLAG_REUSE was provided to the function
During module unload, only buffers with references can prevent cleanup. In
this case the only possibility to free the driver is to call kmem_free

KMEM_FLAG_PERSISTENT - if passed to the allocation routine, changes the mode
of the buffer to PERSISTENT; if passed to the free routine, it changes the
mode of the buffer back to NORMAL. Basically, if we call 'pci --dma-start'
this flag should be passed to alloc, and if we call 'pci --dma-stop' it should
be passed to free. In all other cases, the flag should not be present.
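
As a rough illustration of the flag rules above, the following Python sketch
models when kmem_free may actually clean a buffer. It is not the driver's
code: the Buffer class, the Flag enum, and the returned strings are
hypothetical names chosen for this example, and rule (b), whose text is
incomplete above, is assumed to behave like rule (a).

```python
from dataclasses import dataclass
from enum import IntFlag


class Flag(IntFlag):
    # Hypothetical stand-ins for the KMEM_FLAG_* constants.
    REUSE = 1
    EXCLUSIVE = 2
    PERSISTENT = 4
    HW = 8


@dataclass
class Buffer:
    exclusive: bool = False
    persistent: bool = False   # buffer is in PERSISTENT mode
    hw_ref: bool = False       # hardware reference is set
    usage: int = 0             # increased by kmem_alloc, decreased by kmem_free
    mmap_refs: int = 0         # obtained on mmap, returned on munmap


def kmem_free(buf: Buffer, flags: Flag) -> str:
    """Model one free request; returns 'freed', 'kept', or 'error'."""
    if buf.usage > 0:
        buf.usage -= 1
    if flags & Flag.HW:
        buf.hw_ref = False       # caller states that the DMA engine is stopped
    if flags & Flag.PERSISTENT:
        buf.persistent = False   # back to the non-persistent mode

    # Rules (a) and (b): still referenced, or still PERSISTENT.
    if buf.mmap_refs > 0 or buf.hw_ref or buf.persistent:
        return "kept" if flags & Flag.REUSE else "error"
    # Rule (c): a shared buffer may still have other users, unless the
    # caller passes EXCLUSIVE to clean up after lost clients.
    if not (buf.exclusive or flags & Flag.EXCLUSIVE) and buf.usage > 0:
        return "kept"
    # Rule (d): the caller asked to keep the buffer for reuse.
    if flags & Flag.REUSE:
        return "kept"
    return "freed"
```

In this model an exclusive buffer allocated with the HW reference is only
released once kmem_free is called with Flag.HW, which mirrors the
'pci --dma-stop' case.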

If the application crashed, munmap will still be called, cleaning the software
references. However, the hardware reference will stay since it is not clear
whether the hardware channel was closed or not. To lift the hardware
reference, the application can be re-executed (or dma_stop called, for
instance).
* If there is no hardware reference, the buffers will be reused by the next
call to the application and, for EXCLUSIVE buffers, cleaned at the end. SHARED
buffers will be cleaned during module cleanup only (no active
* The buffer will be reused by the next call, which can result in wrong
behaviour if the buffer was left in an incoherent state. This should be
handled at the upper level.

- At the pcilib/kmem level, synchronization of multiple buffers is performed
* The HW reference and the following modes should be consistent between member
parts: REUSABLE, PERSISTENT, EXCLUSIVE (only the HW reference and PERSISTENT
mode should be checked, the others are handled at the driver level)
* It is fine if only some of the buffers are reused and others are newly
allocated. However, at a higher level this can be checked and resulting

Treatment of inconsistencies:
* Buffers are in PERSISTENT mode, but newly allocated, OK
* Buffers are reused, but are not in PERSISTENT mode (for EXCLUSIVE buffers
this means that the application has crashed during the last execution), OK
* Some of the buffers are reused (not just REUSABLE, but actually reused),
others are not, OK as long as
a) either the PERSISTENT flag is set or the reused buffers are non-PERSISTENT
b) either the HW flag is set or the reused buffers do not hold a HW reference
* PERSISTENT mode inconsistency, FAIL (even if we are going to set
PERSISTENT mode anyway)
* HW reference inconsistency, FAIL (even if we are going to set

On an allocation error at some of the buffers, call the clean routine and
* Preserve the PERSISTENT mode and HW reference if the buffers held them
before the unsuccessful kmem initialization. Until the last failed block, the
blocks of kmem should be consistent. The HW/PERSISTENT flags should be removed
if all reused blocks were in HW/PERSISTENT mode. The last block needs
special treatment. The flags may be removed for the block if it was in
HW/PERSISTENT state (and the others were not).
* Remove the REUSE flag; we want to clean if allowed by the current buffer
status
* The EXCLUSIVE flag is not important for the kmem_free routine.
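
The checks listed under "Treatment of inconsistencies" can be sketched as a
small Python function. This is only a model of the stated rules: the Block
class, the check_blocks name, and the reason strings are hypothetical and do
not come from pcilib.

```python
from dataclasses import dataclass


@dataclass
class Block:
    reused: bool       # actually reused, not merely REUSABLE
    persistent: bool   # was in PERSISTENT mode before this allocation
    hw_ref: bool       # held a HW reference before this allocation


def check_blocks(blocks, want_persistent, want_hw):
    """Return None if the set of blocks is acceptable, else a reason string."""
    reused = [b for b in blocks if b.reused]
    if not reused:
        return None  # everything is newly allocated: OK
    # All reused blocks must agree with each other.
    if len({b.persistent for b in reused}) > 1:
        return "PERSISTENT mode inconsistency"
    if len({b.hw_ref for b in reused}) > 1:
        return "HW reference inconsistency"
    if len(reused) < len(blocks):
        # Mixed case: new blocks must end up in the same state as reused ones.
        if reused[0].persistent and not want_persistent:
            return "new blocks would not be PERSISTENT"
        if reused[0].hw_ref and not want_hw:
            return "new blocks would not hold a HW reference"
    return None
```

Reused blocks that are not in PERSISTENT mode pass the check, which matches
the "application has crashed during the last execution" case above.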

There are 4 components of DMA access:
* DMA engine enabled/disabled
* DMA engine IRQs enabled/disabled - always enabled at startup
* Ring start/stop pointers

To prevent multiple processes from accessing the DMA engine in parallel, the
first action is buffer initialization, which will fail if the buffers are
already used
* Always with REUSE, EXCLUSIVE, and HW flags
* Optionally with PERSISTENT flag (if DMA_PERSISTENT flag is set)
If another DMA app is running, the buffer allocation will fail (no dma_stop
is executed in this case)
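
A minimal sketch of the flag selection at dma_start, assuming hypothetical
Python stand-ins (KmemFlag, DmaFlag, ExclusiveBuffer) for the real constants
and for the driver-side ownership check:

```python
from enum import IntFlag


class KmemFlag(IntFlag):
    # Hypothetical stand-ins for the KMEM_FLAG_* constants.
    REUSE = 1
    EXCLUSIVE = 2
    PERSISTENT = 4
    HW = 8


class DmaFlag(IntFlag):
    # Hypothetical stand-in for the DMA_PERSISTENT flag.
    PERSISTENT = 1


def start_alloc_flags(dma_flags):
    """Flags used for buffer initialization at dma_start."""
    flags = KmemFlag.REUSE | KmemFlag.EXCLUSIVE | KmemFlag.HW
    if dma_flags & DmaFlag.PERSISTENT:
        flags |= KmemFlag.PERSISTENT
    return flags


class ExclusiveBuffer:
    """Toy buffer that can be held by one process at a time."""

    def __init__(self):
        self.owner = None

    def alloc(self, pid, flags):
        # A second process asking for the same EXCLUSIVE buffer is refused.
        if flags & KmemFlag.EXCLUSIVE and self.owner not in (None, pid):
            return False
        self.owner = pid
        return True
```

With this model, a second process calling alloc on the same buffer gets
False, so it never reaches the point where it would stop the engine.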

Depending on the PRESERVE flag, kmem_free will be called either with the REUSE
flag, keeping the buffer in memory (this is redundant since the HW flag is
enough), or with the HW flag, indicating that the DMA engine is stopped and
the buffer can be cleaned. The PERSISTENT flag is defined by the
DMA_PERSISTENT flag passed to the stop routine.

The PRESERVE flag is enforced if DMA_PERSISTENT is not passed to the dma_stop
routine and it is either:
a) Explicitly set by the DMA_PERMANENT flag passed to dma_start
b) Implicitly set if the DMA engine is already enabled during dma_start,
all buffers are reused, and they are in persistent mode.
If the PRESERVE flag is on, the engine will not be stopped at the end of
execution (and the buffers will stay because of the HW flag).
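
The PRESERVE decision and the resulting kmem_free flags can be written down
as two small functions. This is a sketch of the rules as stated here, not the
library code; the function and parameter names are made up for the example
and the flags are represented as plain strings.

```python
def preserve_enforced(stop_persistent, start_permanent,
                      engine_was_enabled, all_reused, all_persistent):
    """Decide whether the engine and buffers are preserved at the end."""
    if stop_persistent:
        # DMA_PERSISTENT passed to dma_stop: never preserve.
        return False
    explicit = start_permanent
    implicit = engine_was_enabled and all_reused and all_persistent
    return explicit or implicit


def stop_free_flags(preserve, stop_persistent):
    """Flags passed to kmem_free when the DMA session ends."""
    # REUSE keeps the buffer; HW tells the driver the engine is stopped.
    flags = {"REUSE"} if preserve else {"HW"}
    if stop_persistent:
        flags.add("PERSISTENT")
    return flags
```

For example, a plain run against an engine that was already started
persistently yields preserve_enforced(...) == True and kmem_free is called
with REUSE only, so the HW reference keeps the buffers alive.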

If the buffers are reused and already in PERSISTENT mode, and the DMA engine
was on before dma_start (PRESERVE flag is ignored, because it can be
enforced), the ring pointers are calculated from LAST_BD and the states of the
ring elements.
The previous application may have crashed (i.e. buffers may be corrupted). Two
* If during the call the buffers were in non-PERSISTENT mode, it can be
easily detected - buffers are reused, but are not in PERSISTENT mode
(or at least were not before we set them to it). In this case we just
reinitialize all buffers.
* If during the call the buffers were in PERSISTENT mode, it is up to the
user to check their consistency and restart the DMA engine.

IRQs are enabled and disabled at each call

standard:            default reading mode, reads a single full packet
multipacket:         reads all available packets
waiting multipacket: reads all available packets; after finishing the
                     last one, waits for new data to arrive
exact read:          reads exactly the specified number of bytes (should
                     only be supported if it is a multiple of packets,
                     otherwise an error should be returned)
ignore packets:      autoterminates each buffer, depends on the engine

To handle the different cases, the value returned by the callback function
instructs the DMA library how long to wait for the next data to appear before
timing out. The following variants are possible:
terminate: just bail out
check:     no timeout, just check if there is data, otherwise
timeout:   standard DMA timeout, normally used while receiving
           fragments of a packet: in this case it is expected
           that the device has already prepared the data and only
           the performance of the DMA engine limits the transfer speed
wait:      wait until the data is prepared by the device; this
           timeout is specified as an argument to the dma_stream
           function (the standard DMA timeout is used by default)

                     first | new_pkt    | buffer
                     --------------------------------
standard             wait  | term       | timeout
multiple packets     wait  | check      | timeout    - DMA_READ_FLAG_MULTIPACKET
waiting multipacket  wait  | wait       | timeout    - DMA_READ_FLAG_WAIT
exact                wait  | wait/term  | timeout    - limited by size parameter
ignore packets       wait  | wait/check | wait/check - just autoterminated
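
The table above can be encoded directly as a lookup, which is one way a
callback could pick its return value. The sketch below is illustrative only:
WAIT_POLICY, STAGES, and wait_policy are hypothetical names, and the combined
entries ("wait/term", "wait/check") are kept as written rather than resolved.

```python
# Columns of the table: first data, start of a new packet, next buffer
# (fragment) of the current packet.
STAGES = ("first", "new_pkt", "buffer")

WAIT_POLICY = {
    "standard":            ("wait", "term",       "timeout"),
    "multiple packets":    ("wait", "check",      "timeout"),
    "waiting multipacket": ("wait", "wait",       "timeout"),
    "exact":               ("wait", "wait/term",  "timeout"),
    "ignore packets":      ("wait", "wait/check", "wait/check"),
}


def wait_policy(mode, stage):
    """Look up how long to wait for the next data in the given mode."""
    return WAIT_POLICY[mode][STAGES.index(stage)]
```

An unknown mode raises KeyError and an unknown stage raises ValueError, so a
typo in either name fails loudly instead of silently picking a timeout.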

Should we do special handling in case of overflow?

The DMA addresses are limited to 32 bits (~4GB for everything). This means we
can't really use the DMA pages as the sole buffers. Therefore, a second
thread, with a realtime scheduling policy if possible, will be spawned and
will copy the data from the DMA pages into the allocated buffers. On
expiration of the duration or the number of events set by the autostop call,
this thread will be stopped, but processing in streaming mode will continue
until all copied data is passed

To avoid stalls, the IPECamera requires data to be read out continuously.
For this reason, there are no locks in the readout thread. It will simply
overwrite the old frames if the data is not copied out in time. To handle
this case, after getting the data and processing it, the calling application
should use the return_data function and check the return code. This function
may return an error indicating that the data was overwritten in the meantime.
In that case, the data is corrupted and should be dropped by the application.
The copy_data function performs this check, and the user application can be
sure it gets coherent data

There is a way to avoid this problem. For raw data, the rawdata callback
can be requested. This callback blocks execution of the readout thread, and
the data may be treated safely by the calling application. However, this may
cause problems for the electronics. Therefore, normally only a memcpy should
be performed on the data.

The reconstructed data, however, may be safely accessed. As described above,
the raw data will be continuously overwritten by the reader thread. However,
reconstructed data, upon the get_data call, will be protected by the mutex.
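
The lock-free overwrite check described above for raw data can be
illustrated with a single-threaded Python model. The FrameRing class and its
fields are hypothetical; only the get_data, return_data, and copy_data names
follow the text, and their real signatures are not shown here. A real
implementation also has to treat the slot currently being written as
invalid, which this simplified model ignores.

```python
class FrameRing:
    """Ring of frames that the readout side overwrites without locking."""

    def __init__(self, size):
        self.size = size
        self.frames = [None] * size
        self.written = 0   # total number of frames written so far

    def write(self, frame):
        """Readout side: never blocks, overwrites the oldest slot."""
        self.frames[self.written % self.size] = frame
        self.written += 1

    def _valid(self, event_id):
        # An event is still in the ring if it is one of the last `size` frames.
        return self.written - self.size <= event_id < self.written

    def get_data(self, event_id):
        """Return the frame without copying, or None if already overwritten."""
        if not self._valid(event_id):
            return None
        return self.frames[event_id % self.size]

    def return_data(self, event_id):
        """Return True if the frame stayed intact while it was in use."""
        return self._valid(event_id)

    def copy_data(self, event_id):
        """Copy the frame and verify afterwards that the copy is coherent."""
        frame = self.get_data(event_id)
        if frame is None:
            return None
        copy = bytes(frame)
        return copy if self.return_data(event_id) else None
```

The important point is the order: the validity check in return_data runs
after the application has finished with the data, so an overwrite that
happened in between is detected and the result can be dropped.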

Register Access Synchronization
===============================
We need to serialize access to the registers by the different running
applications and handle the case when registers are accessed indirectly by
writing PCI BARs (DMA implementations, for instance).

- Module-assisted locking:
* During initialization the locking context is created (which is basically
a kmem_handle of type LOCK_PAGE).
* This locking context is passed to the kernel module along with the lock type
(LOCK_BANK) and lock item (BANK ADDRESS). If the lock context already owns the
lock on the specified bank, just the reference number is increased; otherwise
we try to obtain a new lock.
* The kernel module just iterates over all registered lock pages and checks if
any holds the specified lock. If not, the lock is obtained and registered
in our lock page.
* This allows sharing access between multiple threads of a single application
(by using the same lock page) or protecting it (by using own lock pages by
each of
* Either on application cleanup or if the application crashed, the memory
mapping of the lock page is removed and, hence, the locks are freed.
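
The bookkeeping behind module-assisted locking can be modelled in a few lines
of Python. This is a toy model under stated assumptions: LockModule and its
methods are hypothetical names, lock pages are identified by plain strings,
and a failed lock attempt simply returns False instead of waiting.

```python
class LockModule:
    """Toy model of the kernel-side lock page registry."""

    def __init__(self):
        self.pages = {}   # page id -> {(lock_type, item): reference count}

    def register_page(self, page):
        self.pages.setdefault(page, {})

    def lock(self, page, lock_type, item):
        key = (lock_type, item)
        held = self.pages[page]
        if key in held:
            held[key] += 1   # already ours: just count the reference
            return True
        # Iterate over all registered lock pages looking for a holder.
        if any(key in locks for locks in self.pages.values()):
            return False
        held[key] = 1
        return True

    def unlock(self, page, lock_type, item):
        key = (lock_type, item)
        held = self.pages[page]
        held[key] -= 1
        if held[key] == 0:
            del held[key]

    def unmap_page(self, page):
        """Application exit or crash: all locks of the page vanish with it."""
        self.pages.pop(page, None)
```

Threads of one application would share a page id and therefore nest their
locks, while two applications with separate pages exclude each other.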

- Multiple ways of accessing registers
Because of reference counting, we can successfully obtain locks multiple
times if necessary. The following locks protect register access:
a) Global register_read/write locks the bank before executing the
implementation
b) The DMA bank is locked by global DMA functions. So we can access the
registers using plain PCI BAR read/write.
c) A sequence of register operations can be protected with pcilib_lock_bank
Reading raw register space or PCI bank is not locked.
* Ok. We can detect the banks which will be affected by PCI read/write and
lock them. But shall we do it?

Register/DMA Configuration
==========================
- XML description of registers
- Formal XML-based (or non XML-based) language for DMA implementation.
a) Writing/Reading register values
b) Wait until <register1>=<value> on <register2>=<value> report error

IRQ types: DMA IRQ, Event IRQ, other types
IRQ hardware source: To allow a purely user-space implementation, as a general
rule, only a single (standard) source should be used.
IRQ source: The dma/event engines, however, may refine this hardware source
and produce the real IRQ source based on the values of registers. For example,
for DMA IRQs the source may represent the engine number, and for Event IRQs
the source may represent the event type.

Only types can be enabled or disabled. The sources are enabled/disabled
by enabling/disabling the corresponding DMA engines or Event types. The
expected workflow is as follows:
* We enable IRQs in user space (normally by setting some registers). Normally,
just the Event IRQs; the DMA IRQs, if necessary, will be managed by the DMA
engine itself.
* We wait for the standard IRQ from the hardware (driver)
* In user space, we check registers to find out the real source
of the IRQ (the driver reports just the hardware source), generate the
appropriate events, and acknowledge the IRQ. This depends on the
implementation and should be managed inside the event API.

I.e. the driver implements just two methods: pcilib_wait_irq(hw_source) and
pcilib_clear_irq(hw_source). Only a few hardware IRQ sources are defined.
In most circumstances, IRQ_SOURCE_DEFAULT is used.
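
The expected workflow can be sketched in Python with a fake driver. This is
purely illustrative: FakeDriver, handle_one_irq, the register names, and the
value chosen for IRQ_SOURCE_DEFAULT are assumptions made for this example;
only the wait/clear pair mirrors the two driver methods named above.

```python
IRQ_SOURCE_DEFAULT = 0   # assumed value, for the sketch only


class FakeDriver:
    """Stand-in for the driver side: reports only the hardware source."""

    def __init__(self):
        self.irq_raised = False
        self.cleared = []

    def wait_irq(self, hw_source):
        raised, self.irq_raised = self.irq_raised, False
        return raised

    def clear_irq(self, hw_source):
        self.cleared.append(hw_source)


def handle_one_irq(driver, registers, on_event):
    """Wait for the standard IRQ, then resolve the real source in user space."""
    if not driver.wait_irq(IRQ_SOURCE_DEFAULT):
        return False
    # The driver reported only the hardware source; the status registers
    # tell us which event types actually fired.
    for event_type in sorted(registers):
        if registers[event_type]:
            on_event(event_type)
            registers[event_type] = 0
    driver.clear_irq(IRQ_SOURCE_DEFAULT)   # acknowledge the IRQ
    return True
```

Here the event API owns the register inspection, so the driver side stays
limited to waiting for and clearing a single hardware source.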

The DMA engine may provide 3 additional methods, to enable, disable,
... To be decided in detail as the need arises...

- JTAG should be connected to the USB connector on the board (next to
Ethernet)
- The computer should be turned off and on before programming
- The environment variable should be loaded
- The application is called 'impact'
No project is needed, cancel initial proposals (No/Cancel)
Double-click on "Boundary Scan"
Right click in the right window and select "Init Chain"
We don't want to select a bit file now (Yes and, then, click Cancel)
Right click on the second (right) item and choose "Assign new CF file"
Select a bit file. Answer No, we don't want to attach SPI to SPI PROM
Select xv6vlx240t and program it
- Shut down and start the computer

v.2: /home/uros/Repo/UFO2_last_good_version_UFO2.bit
v.3: /home/uros/Repo/UFO3
Step5 - best working revision
Step6 - last revision