qca99x0 and qca4019 solutions limit probe responses transmissions.
Logging warning message for each probe response drop is flooding
kernel log unnecessary with " failed to increase tx mgmt pending
count: -16, dropping". Hence reducing log level to debug.
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
pre-calibration is meant for qca4019 which contains only caldata
whereas calibration file is used by ar9888 and qca99x0 that contains
both board data and caldata. So by definition both pre-cal-file and
cal-file can not coexist. Keeping them in shared memory (union), is
breaking boot sequence of qca99x0. Fix it by storing both binaries
in separate memories. This issue is reported in ipq8064 platform which
includes caldata in flash memory.
Fixes: 3d9195ea19e4 ("ath10k: incorporate qca4019 cal data download sequence")
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
When HW crypto is used, there's no need for the CCMP/GCMP MIC to
be available to mac80211, and the hardware might have removed it
already after checking. The MIC is also useless to have when the
frame is already decrypted, so allow indicating that it's not
present.
Since we are running out of bits in mac80211_rx_flags, make
the flags field a u64.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Check and parse Rx MAC timestamp when firmware sets its flag
to status variable.
10.4 firmware adds it in Rx beacon frame only at this moment.
Drivers and mac80211 may utilize it to detect such clockdrift
or beacon collision and use the result for beacon collision
avoidance.
Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Since tx completion and rx indication processing are moved out
of txrx tasklet and rx ring lock contention also removed from txrx
for rx_ind messages, it would be efficient to combine both replenish
and txrx tasks. Refill threshold is adjusted for both AP135 and AP148
(low and high end systems). With this adjustment in AP135, TCP DL is
improved from 603 Mbps to 620 Mbps and UDP DL is improved from 758 Mbps
to 803 Mbps. Also no watchdog are observed on UDP BiDi.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Whenever htt rx indication i.e target to host messages are received
on rx copy engine (CE5), the message will be freed after processing
the response. Then CE 5 will be refilled with new descriptors at
post rx processing. This memory alloc and free operations can be avoided
by reusing the same descriptors.
During CE pipe allocation, full ring is not initialized i.e n-1 entries
are filled up. So for CE 5 full ring should be filled up to reuse
descriptors. Moreover CE 5 write index will be updated in single shot
instead of incremental access. This could avoid multiple pci_write and
ce_ring access. From experiments, It improves CPU usage by ~3% in IPQ4019
platform.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The physical address necessary to unmap DMA ('bufferp') is stored
in ath10k_skb_cb as 'paddr'. For diag register read and write
operations, 'paddr' is stored in transfer context. ath10k doesn't rely
on the meta/transfer_id. So the unused output arguments {bufferp, nbytesp
and transfer_idp} are removed from CE recv_next completion.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Except qca61x4 family chips (qca6164, qca6174), copy engine 5 is used
for receiving target to host htt messages. In follow up patch, CE5
descriptors will be reused. In such case, same API can not be used as
htc layer callback where the response messages will be freed at the end.
Hence register new API for HTC layer that free up received message and
keep the message handler common for both HTC and HIF layers.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
In follow up patch, htt rx descriptors will be reused instead of
dealloc and refill. To achieve that htt rx indication messages
should not be deferred and should be processed in pci tasklet itself.
Also from rx indication message, mpdu_count alone is used. So it is
maintained as atomic variable and all rx amsdu handlers are done
processed from txrx tasklet. This change get rid of rx_compl_q usage.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Make amsdu handlers (i.e amsdu_pop and rx_h_handler) common to both
rx_ind and frag_ind htt events. It is sufficient to hold rx_ring lock
for amsdu_pop alone and no need to hold it until the packets are
delivered to mac80211. This helps to reduce rx_lock contention as well.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The fw descriptor was never used and probably never will be. It makes
little sense to maintain support for it. Remove it and simplify rx
processing. This will make it easier to optimize rx processing later
as well.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
To optmize CPU usage htt rx descriptors will be reused instead of
refilling it for htt rx copy engine (CE5). To support that all htt rx
indications should be proecssed at same context. Instead of queueing
actual indication message, queue copied message for txrx processing.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
To optimize CPU usage htt rx descriptors will be reused instead of
refilling it for htt rx copy engine (CE5). To support that all htt rx
indications should be processed at same context. FIFO queue is used
to maintain tx completion status for each msdu. This helps to retain
the order of tx completion.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Rx duration support for per station is part of extended peer
stats, enable provision to parse the same and provide backward
compatibility based on the 'stats_id' event
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Add API support for Extended Resource Configuration for 10.4. This
is useful to enable new features like Peer Stats, LTEU etc if the
firmware advertises support for the service. This is also done to
provide backward compatibility with older firmware. Also for clarity
send default host platform type as 'WMI_HOST_PLATFORM_HIGH_PERF',
though this should not make any difference in functionality
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Provide a debugfs entry to enable/ disable Peer Stats feature.
Peer Stats feature is for developers/users who are more interested
in studying in Rx/Tx stats with multiple clients connected, hence
disable this by default. Enabling this feature by default results
in unneccessary processing of Peer Stats event for every 500ms
and updating peer_stats list (allocating memory) and cleaning it up
ifexceeds the higher limit and this can be an unnecessary overhead
during long run stress testing.
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
qca4019 calibration data is stored in the host memory and it's mandatory
to download it even before reading board id and chip id from the target.
Also, there is a need to execute otp (download and run) twice, one after
cal data download and another one after board data download.
Existing cal data file name 'cal-<bus>-<id>.bin' and device tree entry
'qcom,ath10k-calibration-data' used in ath10k has assumption that it
carries other data (like board data) also along with the calibration data.
But, qca4019 cal data contains pure calibration data (doesn't include
any other info). So, using existing same cal file name and DT entry
in qca4019 case would alter the purpose of it. To avoid this, new cal
file name 'pre-cal-<bus>-<id>.bin' and new device tree entry name
'qcom,ath10k-pre-calibration-data are introduced.
Overall qca4019's firmware download sequence would look like,
1) Download cal data (either from a file or device tree entry)
at the address specified by target in the host interest area
member "hi_board_data".
2) Download otp and run with 0x10 (PARAM_GET_EEPROM_BOARD_ID)
as a argument.
At this point, otp will take back up of downloaded cal data
content in another location in the target and return valid
board id and chip id to the host.
3) Download board data at the address specified by target
in host interest area member "hi_board_data".
4) Download otp and run with 0x10000 (PARAM_FLASH_SECTION_ALL) as
a argument.
Now otp will apply cal data content from it's backup on top
of board data download in step 3 and prepare final data base.
5) Download code swap and athwlan binary content.
Above sequences are implemented (step 1 to step 4) in the name of
pre calibration configuration.
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
ath10k_download_cal_dt() compares obtained cal data content length
against QCA988X_CAL_DATA_LEN (2116 bytes). It was written by keeping
qca988x in mind. In fact, cal data length is more chip specific.
To make ath10k_download_cal_dt() more generic and reusable for other
chipsets (like qca4019), cal data length is moved to hw_params.
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Both ath10k_download_cal_file() and ath10k_download_cal_dt() uses
hard coded file pointer (ar->cal_file) and device tree entry
(qcom,ath10k-calibration-data) respectively to get calibration
data content.
There is a need to use those two functions in qca4019 calibration
download sequence with different file pointer and device tree entry name.
Modify those two functions to take cal data location as an argument.
So that it can serve the purpose for other file pointer and device
tree entry.
This is just preparation before adding actual qca4019 calibration
download sequence. No functional changes.
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
If device failed to init during early probing
(which is quite rare) it triggered driver to
compute crc before ar->firmware was ready causing
an oops.
Fixes: 3e58044b61a9 ("ath10k: print crc32 checksums for firmware and board files")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This prevents tx hangs or hiccups if pull-push
supporting firmware defines per-txq thresholds or
switches modes dynamically.
Fixes: 299468782d94 ("ath10k: implement wake_tx_queue")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The wake_tx_queue/push_pending logic had a bug
which could stop queues indefinitely effectivelly
breaking traffic.
Fixes: 299468782d94 ("ath10k: implement wake_tx_queue")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Results obtained from scan can be used for spectrum management by
doing something like building information of preferred channel
lists and sharing them with stations around. It is to be noted
that traffic to the connected stations would be affected during
the scan.
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
ath10k_core_probe_fw() simply returns error without freeing
cached firmware file content when get board id operation fails.
Free cached fw bin data in failure case to avoid memory leak.
Fixes: db0984e51a ("ath10k: select board data based on BMI chip id and board id")
Signed-off-by: Raja Mani <rmani@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Frames that are transmitted via MGMT_TX are using reserved descriptor
slots in firmware. This limitation is for the htt_mgmt_tx path itself,
not for mgmt frames per se. In 16 MBSSID scenario, these reserved slots
will be easy exhausted due to frequent probe responses. So for 10.4
based solutions, probe responses are limited by a threshold (24).
management tx path is separate for all except tlv based solutions. Since
tlv solutions (qca6174 & qca9377) do not support 16 AP interfaces, it is
safe to move management descriptor limitation check under mgmt_tx
function. Though CPU improvement is negligible, unlikely conditions or
never hit conditions in hot path can be avoided on data transmission.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Whenever firmware is configuring operating channel during scan or
home channel, channel change event will be indicated to host. In some
cases (device probe/ last vdev down), target will be configured to
default channel whereas host is unaware of target's operating channel.
This leads to packet drop due to unknown channel and kernel log will be
filled up with "no channel configured; ignoring frame(s)!". Fix that
by handling HTT_T2H_MSG_TYPE_CHAN_CHANGE event.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Check and set Rx MAC timestamp when firmware indicates it.
Firmware adds it in Rx beacon frame only at this moment.
Driver and mac80211 may utilize it to detect such clockdrift
or beacon collision and use the result for beacon collision
avoidance.
Signed-off-by: Peter Oh <poh@qca.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Until now only WMI originating mgmt frames were
reported to mac80211. Management frames on HTT
were basically dropped (except frames which looked
like management but had FCS error).
To allow sniffing all frames (including offloaded
frames) without interfering with mac80211
operation and states a new rx_flag was introduced
and is not being used to distinguish frames and
classify them for mac80211.
Signed-off-by: Grzegorz Bajorski <grzegorz.bajorski@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The number of HTT Tx descriptors and qcache peer
limit aren't hw-specific. In fact they are
firmware specific and should not be placed in
hw_params.
The QCA4019 limits were submitted with the peer
flow control firmware only and to my understanding
there's no non-peer-flow-ctrl QCA4019 firmware.
However QCA99X0 is planned to run firmware
supporting the feature as well. Therefore this
patch enables QCA99X0 to use 2500 tx descriptors
whenever possible instead of just 1424.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
QCA4019 can queue up to 2500 frames at a time.
This means it requires roughly 5000 entires on the
ring to work properly. Otherwise random tx failure
may occur.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The current/old tx path design was that host, at
its own leisure, pushed tx frames to the device.
For HTT there was ~1000-1400 msdu queue depth.
After reaching that limit the driver would request
mac80211 to stop queues. There was little control
over what packets got in there as far as
DA/RA was considered so it was rather easy to
starve per-station traffic flows.
With MU-MIMO this became a significant problem
because the queue depth was insufficient to buffer
frames from multiple clients (which could have
different signal quality and capabilities) in an
efficient fashion.
Hence the new tx path in 10.4 was introduced: a
pull-push mode.
Firmware and host can share tx queue state via
DMA. The state is logically a 2 dimensional array
addressed via peer_id+tid pair. Each entry is a
counter (either number of bytes or packets. Host
keeps it updated and firmware uses it for
scheduling Tx pull requests to host.
This allows MU-MIMO to become a lot more effective
with 10+ clients.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Firmware 10.4.3 onwards can support a pull-push Tx
model where it shares a Tx queue state with the
host.
The host updates the DMA region it pointed to
during HTT setup whenever number of software
queued from (on host) changes. Based on this
information firmware issues fetch requests to the
host telling the host how many frames from a list
of given stations/tids should be submitted to the
firmware.
The code won't be called because not all
appropriate HTT events are processed yet.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This implements very basic support for software
queueing. It also contains some knobs that will be
patched later.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This merely adds some parsing, generation and
sanity checks with placeholders for real
code/functionality to be added later.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The pull-push functionality of 10.4 will be based
on peer_id and tid. These will need to be mapped,
eventually, to ieee80211_txq to be used with
ieee80211_tx_dequeue().
Iterating over existing stations every time
peer_id needs to be mapped to a station would be
inefficient wrt CPU time.
The new firmware, which will be the only user of
the code flow-wise, will guarantee to use low
peer_ids first so despite peer_map's apparent huge
size d-cache thrashing should not be a problem.
Older firmware hot paths will effectively not use
peer_map.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The 10.4.3 firmware with congestion control
guarantees that each peer has only a single
peer_id mapping.
The 1:1 mapping isn't the case for older firmwares
(e.g. 10.4.1, 10.2, 10.1) but it should not
matter. This 1:1 mapping is going to be only used
by future code which inherently (flow-wise) is for
10.4.3.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Tx pending counter logic assumed that the sk_buff
is already known and hence was performed in HTT
functions themselves.
However, for the sake of future wake_tx_queue()
usage the driver must be able to tell whether it
can submit more frames to firmware before it
dequeues frame from ieee80211_txq (and thus long
before HTT Tx functions are called) because once a
frame is dequeued it cannot be requeud back to
mac80211.
This prepares the driver for future changes.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Some future changes will need to determine final
tx method early on. Prepare the code.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
This prepares the code for future reuse with
ieee80211_txq and wake_tx_queue() in mind.
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
First check for the device state before enabling / disabling
btcoex, also return a proper error value. Enabling / disabling
btcoex ideally does a f/w + ath10k_core_restart so the checks
that are applicable for 'simulate_fw_crash' shall be applicable
for this as well
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
To enable per peer stats feature we are reducing the number of peers.
Firmware has introduced tx stats feature. We have memory limitation in
firmware to add these additional bytes.
These are the new variables introduced in the firmware.
======== =======================
Variable Bytes required/per rate
======== =======================
TX success packets 1
TX failed packets 1
Retry packets 1
Success bytes 2
TX failed bytes 2
Retry bytes 2
Tx duration 4
Rate 1
Bw and AMPDU flags 1
Total 16 (because of allocation in word pattern)
Firmware sends these tx_stats in pktlog.
If we consider 4 feedbacks at a time, Frimware need about ~1K memory for coding
and 8192 bytes required / per rate [ 4*16*128(peers)].
To accommodate this firmware needs to reduce 10 peers.
This fixes a firmware crash with firmware-5.bin_10.2.4.70.22-2.
Signed-off-by: Anilkumar Kolli <akolli@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
On multicore systems, it is possible that txrx tasket can run
in parallel with pci tasklet (i.e smp affinity of ath10k irq is
assigned to multiple CPUs). Feeding and consuming from the same
rx completion list leads to txrx tasklet runs for longer period.
Prevent this by processing a snapshot of rx queue by moving list
into temporary list. Consecutive received frames will be processed
in next batch.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Received frame indications are queued into a skb list and latest
processed by txrx tasklet. This skb queue is protected by htt rx lock.
Since the entire rx processing till delivering frame to mac80211 and
replenish tasks are processed under rx_lock protection, there might be
some delay in queuing newly received rx frame into that list on
multicore systems. Optimize this by using skb list lock while accessing
rx completion queue instead of htt rx lock.
Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
The ath10k_pci_hif_exchange_bmi_msg() function may return the positive
value EIO instead of -EIO in case of error.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
pktlog data is different between firmware variants (eg. 10.2 vs 10.4). To
have a unified user space script to decode pktlog trace events generated,
it is desirable to know which firmware variant has provided the events and
thereby decode the pktlogs appropriately. Hardware revision (hw_rev) helps
to determine the firmware variant sending these trace events. So add hw_rev
to trace events.
Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Currently, we are providing wrong payload data of pktlog to trace points.
Data we receive from FW through copy engine 8 contains pktlog data alone.
We don't need to parse anything in driver before handing it to trace
points.
Fixes: afb0bf7f530b ("ath10k: add support for pktlog in QCA99X0")
Signed-off-by: Ashok Raj Nagarajan <arnagara@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
We periodically receive f/w stats event for updating
the rx duration and there is no reason to keep on appending
the f/w stats peer list, as this gets completely cleaned up when
the user polls for f/w stats {pdev, vdev, peer stats}. Only don't
print the warning message in the case PEER_STATS service is enabled
Fixes: 856e7c3 ("ath10k: add debugfs support for Per STA total rx duration")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
We are not updating peer stats rx_duration periodically
unless the user one polls for fw_stats, this is because
we discard the update event since pdev list is empty. Fix
this by updating rx duration periodically irrepective of checks
for pdev list (irrespective of ping-pong response)
Fixes: 856e7c3 ("ath10k: add debugfs support for Per STA total rx duration")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>